PROFESSIONAL LIBRARY
AMERICAN  PRINTING  HOUSE  FOR  THE  BLIND 
LOUISVILLE,  KENTUCKY 


Digitized  by  the  Internet  Archive 
in  2019  with  funding  from 
American  Printing  House  for  the  Blind,  Inc. 


https://archive.org/details/proceedingsofint01lesl 


The  late  Gustavus  E.  Pfeiffer,  an  outstanding  philanthropic  leader, 
was  concerned  particularly  with  the  problems  of  blind  persons.  He 
served  as  a  Trustee  of  the  American  Foundation  for  the  Blind  from 
1932  to  1953.  The  Foundation  which  bears  the  name  of  Mr.  and 
Mrs.  Gustavus  E.  Pfeiffer  felt  that  the  subject  matter  of  these 
Proceedings  was  clearly  an  extension  of  his  own  interest  in  sound 
research applied to the cause of human welfare. This book is dedicated
to him and to our recommitment to his ideals.



PROCEEDINGS  OF  THE 
INTERNATIONAL  CONGRESS  ON 

TECHNOLOGY 
AND  BLINDNESS 


GENERAL CHAIRMAN: JEROME B. WIESNER

ASSOCIATE CHAIRMEN: M. Robert Barnett

N. Charles Holopigian
EDITOR OF THE PROCEEDINGS: Leslie L. Clark


VOLUME  I 


Panel  I  —  Man-Machine 

Systems 

Chairman:  Samuel  J.  Mason 


Panel  V— Plenary  Session 

Chairman: Charles Hedkvist




New  York  1963 

THE  AMERICAN  FOUNDATION  FOR  THE  BLIND 



COPYRIGHT  ©  1963 
AMERICAN  FOUNDATION  FOR  THE  BLIND,  INC. 
15  WEST  16th  STREET,  NEW  YORK  11,  N.  Y. 

SECOND  EDITION 


PRINTED  IN  THE  UNITED  STATES  OF  AMERICA 


THE  WILLIAM  BYRD  PRESS,  INC. 


FOREWORD 


For  centuries,  civilized  man  has  regarded  blindness  with  something  of  the 
irrational belief of primitive peoples: that it was something more than a disability,
more than the loss of one of the sensory faculties of the body. This
belief, fed on superstition and ignorance, created attitudes toward blindness
that were as blind as the disability itself. Within the last century, however,
we have witnessed a renewed effort to consider in what ways actual
substitutes for natural vision could be invented or, if that ideal remains
unachievable, at least to reduce the social isolation and physical handicaps
imposed  by  blindness  on  the  individual. 

The  full  impact  of  the  introduction  of  science  into  these  areas  has  not 
yet  been  felt.  The  number  of  involvements  over  the  past  several  decades  is 
impressive,  and  the  specific  achievements  to  the  present  are  many.  But  the 
understanding  of  the  basic  principles  underlying  sensory  loss  is  still  far 
from  satisfactory,  nor  has  it  penetrated  deeply  enough  into  the  scientific 
community.  Nor  has  the  knowledge  of  what  has  already  been  accomplished, 
and what already exists to assist the blind person, attained general international
recognition.

An  appreciation  of  these  several  facts  led,  during  the  past  decade,  to  a 
conviction  that  some  means  should  be  found  to  improve  both  the  quantity 
and  the  quality  of  basic  research  into  sensory  loss,  with  special  reference  to 
blindness. Consequently, the American Foundation for the Blind undertook
an international survey of technical devices during the period of 1960
to 1962 which not only pinpointed specific gaps in our knowledge and technology,
but planned also for the convening of an international assembly of
scientists  to  explore  the  gaps  in  fundamental  knowledge  in  detail. 

This  volume  is  the  first  of  four  which  have  issued  from  the  papers  and 
discussions  of  that  Congress,  which  was  held  in  New  York  City  on  18 
through  22  June  1962.  Its  publication  was  made  possible  through  a  grant 
from  the  Gustavus  and  Louise  Pfeiffer  Research  Foundation,  to  whom  we 
express  grateful  acknowledgment. 

While  the  American  Foundation  for  the  Blind  was  privileged  to  carry  out 
the responsibility of the survey and the details of the Congress, it is apparent
that no such study is ever accomplished by just one organization. The
organizations  and  individuals  who  cooperated  with  us  included  virtually 



all  those  who  were  aware  of  a  contribution  they  had  to  make;  all  of  them 
should  be  commended  for  a  fine  example  of  an  unselfish  willingness  to  share. 

Projects  of  this  type  are  also  expensive.  The  American  Foundation  for 
the  Blind  wishes  particularly  to  acknowledge  the  financial  support  of  the 
Office  of  Vocational  Rehabilitation,  Department  of  Health,  Education,  and 
Welfare;  the  National  Science  Foundation;  the  Irene  Heinz  Given  and  John 
LaPorte  Given  Foundation;  Howe  Press  of  Perkins  School  for  the  Blind, 
and  the  Gustavus  and  Louise  Pfeiffer  Research  Foundation.  Whatever 
degree  of  success  the  survey  and  the  Congress  enjoyed  would  certainly  have 
been  impossible  without  the  assistance  of  these  organizations. 

The  American  Foundation  for  the  Blind  also  wishes  to  recognize  the 
invaluable  cooperation  of  the  American  Foundation  for  Overseas  Blind, 
and  the  World  Council  for  the  Welfare  of  the  Blind,  in  helping  these  projects 
in  many  ways.  Our  several  national  and  international  bodies  in  the  field  of 
education,  rehabilitation,  and  social  improvement  of  the  lives  of  blind 
persons  have  pledged  that  this  effort  is  not  simply  a  burst  of  enthusiasm, 
but  that  its  real  findings  will  guide  many  of  our  future  activities. 

The  present  volume  deals  with  the  question  of  relationship  between 
man  and  machine — in  the  practical  realization  of  an  amplification  of  the 
powers  of  the  individual  visually  impaired,  to  bring  him  nearer  the  goal 
of  a  normal  and  happy  life.  Other  volumes  explore  the  sensory  loss  in  detail 
(Volume  II),  the  classes  of  aids  and  immediate  practical  applications  for 
the  visually  impaired  (Volume  III),  and  a  catalog  of  aids  and  devices 
available from over the world (Volume IV).

M.  Robert  Barnett 

Executive  Director 

American  Foundation  for  the  Blind 


CONTENTS 


v  Foreword.  M.  Robert  Barnett 

PANEL  I— MAN-MACHINE  SYSTEMS 
1  Introduction.  Samuel  J.  Mason 

SECTION  I - MOBILITY  AND  MOBILITY  DEVICES 

7  The  Requirements  for  Successful  Travel  by  the  Blind.  John 
K.  Dupress 

13  Techniques  of  Information  Generation:  The  Cane.  Thomas 
B.  Sheridan 

35  Active  Energy  Radiating  Systems:  The  Bat  and  Ultrasonic 
Principles I. J. David Pye

49  Active  Energy  Radiating  Systems:  The  Bat  and  Ultrasonic 
Principles II; Acoustical Control of Airborne Interceptions
by Bats. Frederic A. Webster

137  Active  Energy  Radiating  Systems:  Ultrasonic  Guidance  for 
the  Blind.  Leslie  Kay 

157 Active Energy Radiating Systems: The 80-Channel Elektroftalm.
Witold Starkiewicz and Tadeusz Kuliszewski

167  Active  Energy  Radiating  Systems:  An  Electronic  Travel 
Aid.  Thomas  A.  Benham  and  J.  Malvern  Benjamin 

177 Active Energy Radiating Systems: The Electronic Cane. Robert
J. Gibson

183  Passive  Systems:  A  Proposed  Stereo-Optical  Edge  Detector. 
Avery  R.  Johnson 

187  Passive  Systems:  Ambient-Light  Obstacle  Detector  with 
Tactile  Output.  Bertil  Jacobson 


193  Passive  Systems:  A  Magnetic  Compass  and  Straight  Course 
Indicator  for  the  Blind.  Bertil  Jacobson 

199  Passive  Systems:  Straight  Line  Travel  Aid  for  the  Blind. 

James  C.  Swail 

SECTION  II - READING  MACHINES 

205  Some  Design  Criteria  for  a  Blind  Reading  Aid.  Maxwell  B. 
Clowes 

215 Review of the Major Functional Concepts of Reading Machines.
Hans A. Mauch

227  Letter  Scanning  for  Character  Recognition.  Samuel  A.  Scharff 

245  Some  of  the  Logic  Presently  Used  in  Character  Recognition 
Machines.  Jacob  Rabinow 

261  Stage  of  Development  of  Automatic  Character  Recognition 
and  Complex  Reading  Machines  for  the  Blind  in  Europe. 
Helmut  Kazmierczak 

279  Visual  Pattern  Recognition:  The  Problems  and  Promise. 

Oliver G. Selfridge

289 Psychological Considerations in the Design of Auditory Displays
for Reading Machines. Michael Studdert-Kennedy and
Alvin  M.  Liberman 

305  Experimental  Studies  of  Human  Factors  in  Perception  and 
Learning  of  Spelled  Speech.  Milton  F.  Metfessel 

309  Tactual-Kinesthetic  Perception  of  Information.  James  C. 
Bliss 

325  Possible  Uses  of  a  Printed  Braille  Reader  with  Spelled 
Speech  Output.  Michael  P.  Beddoes 

343  The  Development  and  Evaluation  of  the  Battelle  Aural 
Reading  Device.  John  L.  Coffey 

361 The Use of the Remaining Sensory Channels (Safe Analyzers)
in Compensation of Visual Function in Blindness.
M. I. Zemtzova, J. A. Kulagin, and L. A. Novikova

381  Review  and  Summary.  Franklin  S.  Cooper 


SECTION  III - INDIRECT  ACCESS  TO  THE  PRINTED  PAGE 


393  Automatic  Machine  Translation:  Potentialities  for  Braille 
Encoding.  Victor  H.  Yngve 

403  Automatic  Braille  Reproduction.  Virgil  E.  Zickel 

409  Enhancing  the  Availability  of  Braille.  Robert  W.  Mann 

427  Ste-Re:  A  System  of  Communications.  William  H.  Stevenson, 
William  J.  Reaves,  and  Wallace  A.  Warren 

455 Problems in Generating Spoken Outputs. Edward E. David,
Jr.

463  Summary  and  Comment.  Oliver  G.  Selfridge 

PANEL  V— PLENARY  SESSION 

471 Introduction. Charles Hedkvist

473  Summary  of  Proceedings  of  the  International  Congress  on 
Technology  and  Blindness.  John  K.  Dupress 

481  Demographic  and  Social  Aspects  of  Blindness.  Eric  Josephson 

497  The  Research  and  Demonstration  Grants  Program  of  the 
Office  of  Vocational  Rehabilitation.  Stephen  P.  Quigley 

507 Stimulation of Research in Technology and Blindness. Milton
D. Graham


515  Index 


Panel  I  —  Man-Machine  Systems 


section  i — Mobility  and  Mobility  Devices 

Chairman: Heinz E. Kallmann
Consulting  Engineer, 
New  York,  New  York 


section  ii — Reading  Machines 

Chairman:  Howard  Freiberger 

Veterans  Administration, 
New  York,  New  York 


section iii — Indirect Access to the Printed Page

Chairman:  Edward  L.  Glaser 

Burroughs  Corporation, 
Paoli, Pennsylvania


INTRODUCTION 


SAMUEL  J.  MASON 
Massachusetts  Institute  of  Technology 
Cambridge,  Massachusetts 


This  Panel  is  concerned  with  man  and  machine  systems;  it  is  divided  into 
three  Sections.  In  my  remarks  here  I  decided  not  to  go  into  the  details  of  the 
papers  that  follow,  but  to  devote  myself  to  a  few  general  remarks.  First, 
some  remarks  on  my  own  interests  in  this  area. 

I  am  a  communications  engineer,  and  I  came  into  this  business  of 
sensory  aids  research  fairly  recently.  About  three  or  four  years  ago  one  of 
my  thesis  students  was  casting  about  for  a  suitable  topic  and,  for  no  special 
reason, we decided to look into the problem of reading machines. Dr. Clifford
Witcher had been doing some work at MIT on sensory aids for some
years  before,  but  his  tragic  death  in  1956  more  or  less  terminated  that 
project.  When  he  was  at  MIT  I  was  not  myself  particularly  interested  in 
sensory  aids.  Since  becoming  interested  in  this  work,  however,  I  have  been 
trying  to  decide  why  I  was  not  interested  at  that  time,  and  also  why  I  am 
very  interested  indeed  at  the  present  time.  The  only  special  claim  I  can 
make  is  that  I  have  no  solutions  to  the  problems  of  the  field  whatsoever, 
and  that  I  am  really  interested  in  the  whole  domain  of  phenomena  involved. 
Yet  I  am  sure  that  my  interest  will  be  maintained.  I  do  know  that  Dr. 
Jerome  Wiesner  and  Dr.  Norbert  Wiener  fostered  such  work  at  MIT  in 
various  ways,  and  I  know  that  Jerry  has  maintained  a  long-term  interest  in 
sensory  aids  problems. 

I  was  fortunate  at  MIT  in  being  able  to  form  a  small  research  group 
on  sensory  aids.  Due  to  some  general  research  support  to  the  Research 
Laboratory  of  Electronics  it  is  easy  to  shift  one’s  activity  with  a  minimum  of 
red  tape.  All  that  I  had  to  do  was  to  let  it  be  known  that  I  would  be  willing 
to  advise  on  theses  in  this  area.  This  is  surely  the  simplest  way  of  forming  a 
research  group!  The  group  has  been  building  up  slowly.  I  find  I  have  to 
spend  much  of  my  time  going  about  for  equipment  and  so  on;  I  find  myself 
divided  up  into  six  jobs  and  suffer  the  same  fate  as  many  of  us  who  find 
that  the  job  we  love  the  most  is  the  one  on  which  it  is  possible  to  spend 
only  a  couple  of  hours  a  week. 

The  first  Section  of  the  Panel  dealt  with  Mobility  and  Mobility  Devices 



in  12  different  papers.  Most  of  the  work  reported  on  represented  ideas  and 
devices  for  the  creation  of  probes  with  which  a  person  afflicted  with  a 
sensory loss could explore the environment. The Program divided these up
into  active  and  passive  types;  the  former  send  out  light,  sound  or  radio 
waves and receive the reflected signal, and the latter look around the environment
through a telescope-like device and respond to light, radio, or
magnetic  fields.  The  papers  represent  a  good  deal  of  very  fine  work;  they 
were especially interesting to me because they offered the additional advantage
of a chance to meet their authors. It was true that in many cases I
knew in the main what I was going to hear, but the chance to meet with
people,  and  the  opportunity  to  see  the  unity  in  the  work  being  done  was 
extremely  valuable  to  me  and  I  am  sure  to  everyone  who  attended  these 
sessions. 

As  someone  said  at  the  sessions,  there  are  things  we  do,  things  we  think 
about  doing,  and  the  things  we  dream  about.  If  I  can  dream  a  bit  about  the 
future,  I  would  say  that  one  of  the  most  fascinating  ideas  comes  to  us 
through some recent physiological work on complex fields in vision. We may
hope that, in addition to probing out in a certain direction in the environment,
a more complicated probe will supply further bits of information
about  the  local  structure  of  the  environment.  It  will  detect  motions,  slopes, 
edges,  and — inferring  from  some  of  the  most  recently-reported  work  in 
hypercomplex  fields — locate  corners  oriented  in  a  certain  way  and  at  certain 
angles.  It  may  be  possible,  in  successive  models  of  such  probes  at  least, 
that  when  one  points  his  instrument  in  a  certain  direction  he  will  see  more 
complicated additional information about the structure, character, or pattern
of that neighborhood of the field at which he is looking.

We  must  keep  in  mind,  however,  that  any  “probe”  aid  looks  only  in 
one  direction;  we  have  to  remember — not  pessimistically  but  realistically — 
that  this  is  essentially  tunnel  vision  and  that  some  people  with  tunnel 
vision  are  classified  as  legally  blind. 

The  second  Section  was  concerned  with  Reading  Machines.  These 
divide  themselves  into  two  types:  the  direct  translation  type,  and  the 
recognition type. The first picks up black/white information with photosensors
and translates it into audio signals or some sort of tactual presentation
without intervening complicated processing. The second identifies a
printed  character  or  other  signal  and  somehow  informs  the  user  of  what  the 
signal  or  character  was;  hopefully  it  will  tell  him  what  syllable  or  word  or 
phoneme  it  read.  There  was  much  discussion  in  this  session  of  the  best 
point  at  which  to  introduce  the  human  being  in  the  data  processing  chain 
of  these  systems.  There  was  also  much  discussion  of  an  “intermediate” 




type  of  machine  which  recognizes  certain  geometrical  features  of  the  symbol 
and  matches  these  to  certain  features  of  speech,  so  there  might  possibly 
result  an  extremely  sophisticated — and  as  yet  unknown — simple  matching 
between  the  printed  page  and  artificial  speech.  Such  a  machine  appears 
structurally  intermediate  in  complexity  between  the  direct  translation  and 
the  recognition  types.  The  interesting  point  was  made,  however,  that  the 
sophistication  required  for  its  eventual  design  might  well  place  it  beyond 
the  complex  recognition  machine. 

The  third  Section  involved  itself  with  the  problem  of  Indirect  Access 
to  the  Printed  Page.  After  a  certain  amount  of  argument  over  braille 
versus  talking  books  versus  braille  modifications  versus  tape,  and  so  on, 
the  main  point  to  me  was  the  importance  of  two  dimensions  of  human 
variability.  I  was  left  with  the  very  strong  impression  that  each  of  these 
means  of  conveying  information  was  useful  and  worth  while.  (Maybe  I  am 
just an optimist.) It became obvious that different people have different
needs; in reading, for example, as Dr. Selfridge rightly points out, a single
person  has  a  variety  of  different  needs:  fast  reading  for  lightweight  texts, 
slower  reading  for  more  difficult  material,  and  so  on.  These  differing 
reading  activities  are  different  qualitatively  and  quantitatively  along  some 
same  scale. 

In  trying  to  put  the  matter  of  the  involvement  of  technologists  or 
scientists  with  sensory  deprivation  research  in  a  different  context,  I  came 
up  with  a  few  characterizations  which  I  shall  offer  for  your  consideration. 
These  characterizations  may  sound  a  little  facetious,  but  I  have  a  point  to 
make.  For  my  model  I  kept  in  mind  the  characters  in  Schulz’s  cartoon 
strip  “Peanuts,”  for  he  displays  mature  human  manners  in  a  humorous  way 
by  letting  the  words  come  out  of  the  mouths  of  children.  Children  are  often 
more  truthful,  perceptive,  and  realistic  than  we  adults.  Thus  I  would  have 
the  technologists  say  to  blind  persons,  or  to  others  in  the  sensory  aids 
effort,  “I  might  want  to  come  into  your  back  yard  and  play  with  you.”  Some 
of  the  ways  in  which  people  might  want  to  go  “.  .  .  into  the  back  yard  and 
play  .  .  .”  with  them  may  be  illustrated  by  the  three  following  quotations: 

(1)  “I  want  to  come  into  your  back  yard  and  teach  you  to  play 
some  games  that  I  invented.  I  am  sure  you  will  like  them  because 
my  intuition  tells  me  so.” 

Or, 

(2)  “I  want  to  come  into  your  back  yard  and  show  you  some 
beautiful  toys  I  have  built.  They  may  not  work  in  your  back  yard, 
but  they  are  so  beautiful  that  it  doesn’t  matter.” 


Finally, 

(3)  “I  want  to  come  into  your  back  yard  and  analyze  the  very 
blinking  hell  out  of  your  games  under  conditions  carefully  specified 
by  me.” 

As  you  can  see,  there  is  a  wide  variety  of  people  who  want  to  “come 
into  the  back  yard  and  play.”  My  attitude  would  be  that  this  is  good:  they 
are  all  interested;  they  want  to  help;  and  their  ideas  are  valuable.  I  would 
not  be  concerned  or  worried  about  anyone’s  motives  in  going  into  the 
sensory  aids  business.  I  remember  that  here,  as  in  any  other  business,  there 
are  even  bad  motives  that  lead  to  good  results.  Many  people  are  no  doubt 
attracted  to  this  work  or  to  rehabilitation  work  in  an  attempt  to  deal  with 
some  personal  guilt  complex  (this  may  sound  Freudian,  and  it  is)  but  this 
is  not  necessarily  a  bad  thing.  If  it  results  in  highly  motivated,  energetic 
work  along  these  lines  the  contribution  is  a  welcome  one.  It  is  only  natural 
to  push  one’s  own  vested  interest — and  this  is  where  energetic  activity  comes 
from.  Interchange  and  discussion,  as  we  had  at  this  Congress,  is  invaluable 
as  a  balancing  force;  with  free  discussion  difficulties  can  be  handled  and 
loose  ends  are  taken  care  of. 

The  solution,  in  short,  lies  in  good  communications — as  has  been 
pointed out by a number of persons here and elsewhere. As a communications
engineer, I dislike the word “communications” perhaps more than
anyone  else  does,  because  I  hear  it  so  often;  but  I  also  think  that  the  way 
to avoid trivial or duplicative work is, after all, a matter of providing information.
Without information we cannot even decide what is trivial,
even  though  this  is  a  matter  of  somebody’s  personal  opinion.  You 
cannot  tell  someone  that  you  think  he  is  “all  wrong.”  Rather,  you  ask  him 
whether  he  is  familiar  with  such-and-such  work.  “Oh  no,”  he  says,  “I 
haven’t  read  that.”  So  you  provide  him  with  a  copy,  and  he  says,  “That’s 
very  interesting;  thank  you  very  much.”  You  can  channel  his  efforts  much 
better  that  way. 

In  the  Section  on  Reading  Machines  the  direct  translation  machine  was 
likened  to  a  bicycle,  and  the  character  recognition  machine — a  large 
library  machine — was  likened  to  some  bulky  realization  akin  to  a  Cadillac. 
Now,  I  have  my  own  opinions  on  some  of  the  work  on  reading  machines; 
I  think  some  of  it  is  going  in  the  right  direction,  and  some  in  the  wrong 
direction: I’m about as opinionated as anyone else! Yet the invidious comparison
evoked the thought that the Wright brothers were tinkering with
bicycles  at  one  time.  I  wonder  whether  they  ever  told  anyone  that  they 




were going to make a bicycle fly? If anyone had told me that, I would have
told them that they were crazy!

Partly  for  this  reason,  and  speaking  for  engineers  who  are  involved  in 
sensory  aids  research,  it  is  invaluable  to  complement  our  knowledge  by 
some  hard  study  in  psychophysics,  sensory  physiology,  and  rehabilitation 
work.  When  I  first  started  advising  theses  in  this  area  I  decided  to  try  the 
blindfold  test.  I  wore  a  blindfold  for  a  couple  of  days.  I  didn’t  find  out  what 
it  was  like  to  be  blind,  of  course,  but  I  did  find  out  a  few  things  about  the 
mechanics of getting about. My overwhelming conclusion from that experience
was that my irritations did not arise over the mechanics of locating
an object, but rather from dear friends of mine who inserted themselves
into my four-foot radius of influence and didn’t tell me their names,
or  went  away  without  telling  me  they  were  going,  or  talked  at  me  about 
trivia.  I  wound  up  rather  angry  at  many  of  my  friends.  Then  I  overcame 
this  anger  because  I  realized  that  they  couldn’t  help  it.  They  were  stupid. 
Compassion won out. This experience pointed up to me a new facet of blindness:
the need of the compassion of the blind for the nonblind.

The  meetings  and  the  papers  presented  here  were  for  me  absolutely 
invaluable.  Although  I  have  access  to  many  of  the  other  people  working 
in  areas  fairly  close  to  my  own,  and  I  could  have  travelled  to  see  others, 
I  am  lazy  like  most  people.  I  don’t  make  the  necessary  trip,  and  I  see 
other  things  that  need  doing  first.  Here  I  had  the  opportunity  to  meet  these 
people,  to  talk  with  them  at  an  intimate  level,  to  argue  with  them,  for  they 
are  people  interested  in  the  same  things  as  I  but  from  a  different  viewpoint. 
My  own  pet  ideas  were  jolted  strongly  by  some  of  these  conversations.  My 
own  reaction  to  my  participation  in  the  Congress  was  that  we  should  not 
seek  to  make  conclusions  but  rather  to  stimulate.  As  one  of  the  participants 
said,  this  is  not  a  terminus  but  rather  a  beginning.  It  may  be  a  terminus  of 
one phase of our work, but I see great stimulation and much encouragement
in the papers laid before you on the following pages.




SECTION  I 


MOBILITY  AND  MOBILITY  DEVICES 

CHAIRMAN:  HEINZ  E.  KALLMANN 
Consulting  Engineer,  New  York,  New  York 


THE  REQUIREMENTS  FOR 
SUCCESSFUL  TRAVEL  BY  THE  BLIND 

JOHN  K.  DUPRESS 

American  Foundation  for  the  Blind,  New  York,  New  York 


Shortly before I joined the Foundation 3½ years ago, I participated in
another  conference  on  technological  research.  A  total  of  20  minutes  was 
allowed  for  discussion  of  complex  mobility  devices  and  reading  machines. 
There  was  no  time  for  discussion  of  the  utilization  of  remaining  sensory 
channels,  or  the  performance  of  intricate  tasks  by  animals  who  have  no 
vision, or for the ultimate solution, namely organ transplants and introduction
of stimuli directly into the central nervous system.

I would like to point out at this time that even if our highest hopes
have not been achieved in the last 3½ years, at least the scope of our view
has  broadened.  I  shall  begin  by  discussing  the  kinds  of  research  projects 
which  have  been  undertaken  in  the  past. 

To  the  best  of  my  knowledge,  a  serious  effort  in  the  development  of 
mobility devices was begun only during World War II. This was the result
of studies by the Office of Scientific Research and Development. The
research centers involved were those of Haskins Laboratories, Stromberg-Carlson
Company, Brush Development Company, the Franklin Institute,



the  U.  S.  Army  Signal  Corps,  and  the  Columbia  Broadcasting  System.  The 
first  prototypes  used  sonic  radiation.  Later  models  used  supersonic  and 
ultrasonic  energy  with  frequencies  as  high  as  60  to  70  kc.  During  this 
same  period,  the  Signal  Corps  began  to  work  on  a  device  radiating  chopped 
visible  light.  This  was  the  era  of  radar  development — the  exploration  of  the 
environment  with  narrow  and  intense  beams  of  invisible  energy.  The  theory 
was that there is a better chance of detecting objects amidst clutter and
noise  if  one  knows  the  characteristics  of  the  emitted  signals. 

At  the  end  of  World  War  II,  research  on  mobility  devices  for  the  blind 
ceased.  The  war  was  over.  Those  who  lost  their  sight  as  a  result  of  the 
war,  however,  do  not  see  any  better  now  than  they  did  in  1943  or  1944. 
The  feeling  seemed  to  be  that  we  should  not  indulge  in  efforts  during  peace 
time  which  do  not  promise  quick  results. 

Although  government-sponsored  projects  ceased,  one  independent 
effort  was  made  by  Heinz  Kallmann,  who  explored  the  possibility  of  using 
ambient  light.  The  human  visual  and  auditory  systems  do  not  radiate 
energy  to  locate  and  recognize  objects.  They  are  passive  receivers  and 
process  only  information  which  comes  from  outside  the  human  being.  Dr. 
Kallmann  made  an  initial  working  prototype  which  detected  the  presence 
of  objects  and  terrain  changes  illuminated  by  ambient,  visible  light.  Since 
then,  others  have  taken  up  this  approach. 

None  of  the  devices  mentioned  above  is  in  the  hands  of  blind  people, 
nor  is  it  possible  to  predict  when  they  will  be. 

We  have  not  come  very  far  in  terms  of  the  faith  we  must  have  in  scientific 
endeavor  and  creative  imagination.  ‘We’  in  this  case  means  the  policy 
makers  who  decide  how  much  research  shall  be  done,  where  it  will  be  done, 
and  by  whom.  There  was  no  publicly  financed  program  to  develop  mobility 
devices  between  World  War  II  and  the  Korean  conflict.  This  is  more  than 
just  a  coincidence.  Soon  after  the  Korean  conflict  began,  a  small  but  steady 
stream  of  blinded  military  personnel  returned  from  the  Far  East. 

The first project in a program sponsored by the Veterans Administration
was to evaluate the Signal Corps’ device and suggest design criteria
for a more useful unit. I believe that part of our difficulty lies in the fact
that certain fundamental questions were not explored by the social scientists,
rehabilitation specialists, and human engineers involved in this
program. For one thing, there has not been a rigorous inquiry into the requirements
for successful travel by the blind.

In  the  first  place  it  cannot  be  assumed  that  a  blind  person  ought  to  want 
to  travel  someplace  because  of  some  device  or  rehabilitation  program.  Look 



at  the  sighted  population.  Consider  how  many  of  them  walk  any  place!  If 
a  blind  person  is  to  undergo  constant  stress  and  danger  by  traveling  in 
unfamiliar  and  in  familiar  environments,  he  must  have  a  good  reason  to 
do  so.  You  will  find,  therefore,  that  only  when  more  blind  people  secure 
jobs,  fulfill  their  interests,  and  are  integrated  into  society,  will  they  have  a 
readiness  and  capability  for  travel. 

Second, mobility differs from other tasks which blind persons are required
to perform in that it is dangerous, both to the blind person and those
whom  he  encounters  in  his  travel.  The  good  blind  traveler  may  be  a  very 
poor  traveler  in  terms  of  the  injury  he  can  cause  to  other  sighted  people. 
A  good  traveler  must  always  be  careful  not  to  solve  the  human  obstacle 
problem  by  colliding  with  or  tripping  his  sighted  fellows. 

What information does a blind person require for travel? For purposes
of instrumentation design and scientific inquiry, we must include obstacle
detection and location, the precise determination of terrain changes,
sufficient  detail  about  the  environment  to  know  precisely  where  one  is,  and 
a  mental  map  to  navigate  efficiently  from  origin  to  destination. 

With  vision  there  is  an  abundance  of  redundant  information  available. 
With severely impaired vision or total blindness there is too little information
for unaided travel. Instrumentation designers have wisely avoided attempts
to construct visual substitutes. In the literature are mentioned pattern
recognition devices and memory store navigation instruments. The
devices which have been built detect the presence of most objects and provide
some range data. Some laboratory prototypes react to terrain changes
of as little as two inches under most circumstances, but they do not allow
for the precision required for efficient step-down or step-up detection. Navigational
devices small enough to be carried by the human are limited to
straight-line  or  direction  indicators  in  the  form  of  radio  and  other  special 
compasses  or  gyros.  It  can  be  seen  readily  that  nearly  all  parts  of  the  task 
must  be  performed  by  the  human  with  his  remaining  sensory  channels  and 
his  mental  resources. 

Can  totally  blind  persons,  or  those  with  very  low  residual  vision, 
travel  without  any  devices?  We  hear  rumors  of  totally  blind  people  who  use 
no  dog,  no  cane,  and  no  human  guide.  I  shall  repeat  an  offer  I  have  made 
at  other  meetings.  I  would  like  to  find  such  a  person  and  try  him  out  under 
conditions  which  are  normally  encountered  in  real  life  situations. 

The  aids  to  mobility  are  the  three  I  just  mentioned.  The  reasonably 
intelligent  human  guide  will  permit  the  individual  to  go  anywhere  quickly, 
with least danger, and with virtually no skill required of the blind person.
The dog guide can solve obstacle and terrain change problems, thus
virtually eliminating personal hazard, but the blind person will still have need
for  proper  orientation  and  navigation.  Finally,  the  cane  user  must  be  alert 
with  his  probe  to  locate  obstacles  and  terrain  changes,  at  the  same  time 
performing  the  tasks  of  orientation  and  navigation.  He  must  also  operate 
with  safety  to  himself  and  others. 

In  designing  instrumentation  so  far,  the  attempt  has  been  to  substitute 
for  the  human  guide,  the  dog  guide,  the  cane,  and  mobility  training.  By 
mobility  training  I  mean  a  formal  program  given  at  a  rehabilitation  center. 
These  courses  last  a  month  or  more  for  the  dog  guide  user  and  12  weeks 
or  more  for  the  cane  traveler.  The  instructors  in  these  centers  are  exper¬ 
ienced  in  teaching  the  blind  how  to  travel  either  with  a  dog  guide  or  with 
a  cane.  I  might  add  that  in  the  case  of  the  cane  traveler,  part  of  this  time  is 
taken  up  by  other  things  which  are  considered  desirable  or  necessary  for 
over-all  rehabilitation. 

Since  we  cannot  readily  substitute  for  the  cane  or  the  dog  guide — and 
no  mobility  device  yet  developed  has  been  able  to  do  this — can  we  add 
something  to  what  these  devices  do?  In  the  case  of  the  dog  guide,  we  might 
want  to  add  navigational  equipment,  perhaps  storage  of  data  to  supplement 
the  mental  mapping  which  the  successful  blind  traveler  must  do.  In  the 
case  of  the  cane,  we  would  want  advance  information  literally  above  and 
beyond  that  which  the  cane  provides.  The  cane  traveler  should  also  have 
more  navigation  and  orientation  information.  A  device  should  not  interfere 
with  useful  cues  already  available  to  the  human  senses.  The  readout  from 
the  device  should  be  synchronized  in  time  with  other  cues.  The  data  from 
the  device  should  not  be  so  complex  that  it  requires  excessive  time  for 
interpretation  preceded  by  extensive  training.  The  traveler’s  attention  must 
be  readily  secured  without  fatigue  or  the  necessity  for  accommodation.  False 
cues  add  to  the  considerable  stress  normally  associated  with  travel  by  the 
blind.  Instrumentation  reliability  is  essential.  Above  all,  there  must  be  fail¬ 
safe  provisions. 

I  do  not  think  researchers  have  paid  enough  attention  to  some  of  the 
very  important  factors  in  man-machine  interaction.  The  people  who  design 
devices  must  therefore  work  closely  with  human  factors  engineers  who 
know  something  about  sensory  deprivation  before  design  parameters  in 
specific  projects  are  determined.  The  instrument  designers  and  human 
engineers  must  also  work  closely  with  mobility  rehabilitation  specialists. 

One  important  point  remains.  Until  that  distant  day  when  science  has 
found a substitute for natural vision, a blind person must perform the
mobility task with limited assistance from devices. In order to train the
individual  to  an  adequate  level  of  mobility  capability,  we  must  develop 
sensory  tests  and  training  procedures.  We  must  study  the  emotional,  moti¬ 
vational,  and  other  psychological  factors  involved.  Until  we  have  a  body 
of  satisfactory  data  developed  from  scientific  inquiry  into  the  nature  of 
the  blind  traveler  and  his  complex  mobility  task,  we  will  continue  to  de¬ 
velop  devices  which  are  not  useful  to  blind  persons. 


TECHNIQUES  OF  INFORMATION 
GENERATION:  THE  CANE 


THOMAS  B.  SHERIDAN 
Massachusetts  Institute  of  Technology, 
Cambridge,  Massachusetts 


Raymond  Loewy  has  been  credited  with  saying  that  a  locomotive  is  an 
example  of  something  easy  to  re-design;  a  needle  is  much  more  difficult  to 
re-design. The distinction is, of course, between something that operates
more  or  less  by  itself,  independent  of  man;  and  something  that  is  inti¬ 
mately  connected  to  him,  as  a  hand  tool  might  be.  I  would  place  the  cane 
in  this  latter  category. 

My  remarks  in  this  paper  will  concern  the  cane  and  the  use  of  the 
cane.  It  may  or  may  not  be  trite  to  say  that  these  remarks  will  concern  the 
so-called  “man/cane  system.”  I  shall  not  discuss  any  single  unified  research 
program,  but  a  potpourri  of  several  research  projects  carried  out  at  MIT. 

Our  fascination  with  the  cane  problem  began  several  years  ago  when 
we  realized  that  the  cane  was  probably  the  only  fully  accepted  mobility 
aid  except,  perhaps,  for  the  dog.  We  cannot  do  much  about  re-designing 
the  dog,  but  we  can  certainly  learn  something  about  why  the  cane  is  so 
valuable.  Yet,  if  we  choose  to  look  at  the  cane,  we  cannot  expect  anything 
meaningful  from  our  scrutiny  unless  we  include  the  human  user,  too. 

During  the  past  year  we  have  tried  to  look  at  several  aspects  of  this 
interaction.  In  the  first  section  below,  we  shall  present  the  results  of  our 
analysis  of  the  cane  taken  by  itself,  that  is,  without  reference  to  its  environ¬ 
mental  context.  In  the  second  section  we  shall  summarize  the  results  of  an 
engineering  analysis  of  information  acquisition  using  the  cane.  Finally,  in 
the  third  section,  we  shall  present  an  account  of  a  project  which  explored 
the  use  of  an  obstacle  course  for  evaluating  the  mobility  of  the  blind  using 
the  cane. 

VIBRATION ANALYSIS OF THE CANE*

* More extensive mathematical development of the details in this section will be
found in Reference 1.

The analysis we did was on a standard aluminum cane provided by
the American Foundation for the Blind. The analysis is not complete
because the evaluation of the mathematical expressions obtained becomes
impracticable if we try to duplicate exactly the conditions in the actual
context of use of the cane. It follows that a direct application of the
numerical data is not possible either. The results we obtained are presented
as an example of how one can consider the cane from the point of view of
classical vibration theory.


Figure  1  The  Standard  Cane 


The cane consists of a length of aluminum tubing about 50 inches long.
It has a lightweight tip and a curved handle of heavier tubing. The ratio
of length to diameter is about 100 to 1. One may conclude that any shear
deflections would be very small in comparison to flexure deflections.

Compression waves are possible. The fundamental natural frequency for
compression waves, however, is about 12,000 radians per second (about
2 kc/sec). The fundamental, where E = Young’s modulus, ρ = mass/volume,
and L = length of the cane, is

$$\omega_1 = \frac{\pi}{L}\sqrt{\frac{E}{\rho}}$$


Since  this  frequency  is  presumably  above  the  upper  threshold  of  detecta¬ 
bility  in  humans,  the  cane  may  therefore  be  considered  a  rigid  body  in 
compression,  so  far  as  the  human  operator  is  concerned — but  not  necessar¬ 
ily  so  far  as  instrumentation  is  concerned. 
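
As a rough numerical check of the compression-wave figure, the short sketch below evaluates the fundamental of a uniform rod with both ends free, for which the first axial mode is π/L times the bar wave speed. The material constants (E = 69 GPa, density 2700 kg/m³) are handbook values for aluminum assumed here, not values taken from the paper, and the function name is illustrative only.

```python
import math

# Assumed handbook values for aluminum (not given in the text).
YOUNGS_MODULUS_PA = 69e9       # E, in pascals
DENSITY_KG_M3 = 2700.0         # rho, mass per unit volume
CANE_LENGTH_M = 50 * 0.0254    # "about 50 inches", from the text


def compression_fundamental(length_m, youngs_modulus, density):
    """Return (omega, f) for the first axial mode of a free-free rod."""
    wave_speed = math.sqrt(youngs_modulus / density)   # bar wave speed, m/s
    omega = math.pi * wave_speed / length_m            # radians per second
    return omega, omega / (2 * math.pi)                # (rad/s, cycles/s)


omega_1, f_1 = compression_fundamental(
    CANE_LENGTH_M, YOUNGS_MODULUS_PA, DENSITY_KG_M3
)
print(f"omega_1 = {omega_1:.0f} rad/s, f_1 = {f_1:.0f} cps")
```

Under these assumptions the result comes out near 12,500 radians per second, or roughly 2 kc/sec, which is of the same order as the figure quoted above.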

Neglecting all other modes but that of flexure, we can describe the
behavior between the ends of the cane by the following:

$$\frac{\partial^2 y}{\partial t^2} = -\frac{EI}{\rho}\,\frac{\partial^4 y}{\partial x^4} \qquad (1)$$


Figure 2 Left End Origin of Cane Flexion


in which E = Young’s modulus; I = area moment of inertia; ρ = mass/length;
and t = time. Appropriate right end conditions would be y = f(t), and


Several  sets  of  left  end  conditions  may  be  considered.  The  first  might 
be  that  resulting  from  representing  the  hand  of  the  user  with  considerable 
damping  and  compliance.  For  example,  consider  the  situation  in  Figure  3 


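in which $y = 0$, and

$$EI\,\frac{\partial^2 y}{\partial x^2} + \left(R\,\frac{\partial}{\partial t} + \frac{1}{c}\right)\left(b + \frac{\partial y}{\partial x}\right) = 0$$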

but, 


therefore, 


dy  n 

'  =  57  =  0; 


El  —  +  g(t)  +  c  =  0  (6) 

ox 

(g and c are arbitrary). Natural frequencies and frequency responses can
be evaluated by assuming $f(t) = a \sin \omega t$ and $y = Y(t)\,Y(x)$.

The second left end condition assumes a fixed left end, which implies
that the cane is held very tightly.


Figure 4 Left End Condition 2


Here $y = \partial y/\partial x = 0$. The natural frequencies are given in the expression

$$y = \sum_{n=1}^{\infty} y(t)\,y_n(x), \quad \text{with } y(t) = a \sin \omega t. \qquad (7)$$

We found that $\omega_1$ is approximately 40 radians per second, or 6.4 cps,
by experimenting on the cane; in this case, for n = 1, 2, 3, and 4,

table 1

NATURAL FREQUENCIES OF THE CANE: LEFT END CONDITION 2

n                  1      2      3      4

ω_n, sec^-1       40    266    743   1450

A third set of left end conditions obtains when the cane is held tightly
between two fingers. The cane is free to pivot in this case, but we neglect
the mass of the cane handle:


Figure 5 Left End Condition 3


$y$ is equal to 0,

$$\frac{\partial^2 y}{\partial x^2} = \frac{J}{EI}\,\frac{\partial^3 y}{\partial t^2\,\partial x} \qquad (8)$$

at $x = 0$. ($J$ is the effective mass moment of inertia about the origin of
coordinates.) Thus, $y = 0$, $\partial^2 y/\partial x^2 = 0$. For the specific cane tested, for
the values of n = 1, 2, 3, and 4, we obtain

table 2

NATURAL FREQUENCIES OF THE CANE: LEFT END CONDITION 3

n                  1      2      3      4

ω_n, sec^-1        0    175    568  11900

Considering  no  support  and  a  weightless  handle,  and  for  the  tubing  of  this 
specific  cane,  we  obtain  these  values: 

table 3

NATURAL FREQUENCIES OF THE CANE: CONDITION 3,
NO SUPPORT, WEIGHTLESS HANDLE

n                  1      2      3      4

ω_n, sec^-1        0    254    700   1375


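(Handle weight can be accounted for by letting

$$\frac{\partial^2 y}{\partial x^2} = \frac{J}{EI}\,\frac{\partial^3 y}{\partial t^2\,\partial x}, \qquad \frac{\partial^2 y}{\partial t^2} = \frac{EI}{N}\,\frac{\partial^3 y}{\partial x^3}$$

at $x = 0$. $N$ = handle mass.)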

It  should  be  noted  that  this  model  implies  that  no  information  is  trans¬ 
mitted  to  the  user  of  the  cane. 

For  any  of  these  calculations  to  become  truly  useful,  it  would  be 
necessary  to  establish  a  set  of  boundary  conditions  which  account  ac¬ 
curately  for  the  hand  of  the  user.  From  empirical  evidence  we  know  that 
the  damping  due  to  the  user’s  hand  is  “greater”  than  the  internal  damping 
in  the  cane — to  such  a  degree  that  damping  in  the  cane  can  very  likely 
be  neglected  if  a  good  set  of  hand  conditions  could  be  established.  In  any 
case,  it  is  easy  to  see  that  these  calculations  very  quickly  become  very 
difficult,  even  for  the  linear  model  of  a  hand  we  used  for  the  first  set  of  left 
end  conditions. 
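
The spacing of the tabulated natural frequencies can be compared with the classical results for a uniform beam. The sketch below is an illustration under textbook Euler-Bernoulli assumptions, not the calculation used in the paper: it finds the roots βL of the standard characteristic equations for clamped-free, pinned-free, and free-free ends by a simple scan and bisection, and reports the frequency ratios, which vary as (βL)². The end-condition labels and function names are introduced here for illustration. The scan starts above zero, so the zero-frequency rigid-body motion of the pinned and free cases is skipped.

```python
import math


def bisect_root(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], given a sign change across it."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


def first_roots(func, count, start=0.5, step=0.01):
    """Scan upward from start and return the first `count` roots of func."""
    roots = []
    x = start
    while len(roots) < count:
        if (func(x) < 0) != (func(x + step) < 0):
            roots.append(bisect_root(func, x, x + step))
        x += step
    return roots


# Characteristic equations in beta*L for a uniform Euler-Bernoulli beam.
CHARACTERISTIC = {
    "clamped-free": lambda x: math.cos(x) * math.cosh(x) + 1.0,
    "pinned-free": lambda x: (math.sin(x) * math.cosh(x)
                              - math.cos(x) * math.sinh(x)),
    "free-free": lambda x: math.cos(x) * math.cosh(x) - 1.0,
}


def frequency_ratios(end_condition, count=3):
    """Ratios omega_n / omega_1 for the first `count` elastic modes."""
    roots = first_roots(CHARACTERISTIC[end_condition], count)
    return [(r / roots[0]) ** 2 for r in roots]


for name in CHARACTERISTIC:
    print(name, [round(r, 2) for r in frequency_ratios(name)])
```

For the free-free case the first three elastic modes stand in the ratio of about 1 : 2.76 : 5.40, close to the ratio 254 : 700 : 1375 of the nonzero entries in Table 3. For the pinned-free case the second-to-first ratio is about 3.24, close to 568/175 in Table 2.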

INFORMATION  ACQUISITION  WITH  THE  CANE 

How  is  the  cane  used  and  what  aspects  of  the  environment  can  be  measured 
with  it?  The  cane  itself  is  a  simple  mechanism,  usually  a  hollow  metal 
tube,  having  a  crook  at  one  end,  and  a  more  or  less  durable  tip  of  metal 
or  plastic  at  the  other.  In  probing  the  environment,  the  user  pushes  on  the 
cane,  and  the  cane  pushes  directly  back  on  the  man.  In  coordinating  his 
arm  motion  with  this  experience  of  constraint,  he  thus  begins  to  sample 
the  geometry  of  the  environment.  There  are  further  cues  arising  from  the 
impedance,  if  you  will,  to  the  pokes  or  taps  of  the  cane.  The  environment 
has  a  spring  characteristic;  it  may  have  a  viscous  characteristic;  it  may 
even  have  a  kind  of  inertial  characteristic.  Furthermore,  there  are  forced 
vibrations  transmitted  through  the  shaft  when  the  cane  is  dragged  along 
a  surface,  and  these  will  differ,  depending  on  the  material  of  the  surface. 

Tapping  the  cane  and  sliding  it  along  both  provide  tactile,  kinesthetic, 
and  vibratory  cues  which  are  transmitted  to  the  user  through  the  shaft  of 
the  cane.  In  addition,  there  are  auditory  signals  from  the  environment 
itself.  The  simply  structured  cane,  then,  provides  the  sentient,  intelligent 
user  with  several  kinds  of  information,  and  is  perhaps  best  considered  an 
extension  of  a  sense  organ.  The  question  is  how  a  man  can  best  use  the  cane 
to  gain  the  most  information  about  those  aspects  of  the  environment  in 
which he is interested. We should also keep in mind the fundamental query
of what effect motor activity has on sensation; there is apparently a
significant difference between active participation in a sensory situation and
passively observing that same environment. These and other questions were
explored in some of our initial experiments on information acquisition
using the cane (3).

One  experiment  was  concerned  with  filtering  various  kinds  of  sensory 
information  selectively.  In  one  case,  hearing  was  masked;  in  another,  the 
subject  used  a  cane  in  which  a  universal  joint  had  been  inserted  so  as  to 
eliminate  certain  cues  about  the  geometry  of  the  object;  and  in  a  third  case 
the  vibrations  in  the  shaft  of  the  cane  were  damped  by  cutting  a  cane  in  half 
and  inserting  a  piece  of  rubber  tubing  between  the  two  parts.  Our  principal 
conclusion  from  all  this  was  that  subjects  can  be  quite  inventive,  and  adjust 
their  techniques  to  compensate  for  the  cues  that  are  missing. 

A  second  experiment  involved  measuring  the  vibratory  information 
from  the  cane  when  different  objects  and  materials  were  probed.  The  vi¬ 
bratory  patterns  differ,  and  also  damp  differently,  depending  on  the  ma¬ 
terial.  The  cane,  however,  has  a  natural  frequency  which  predominates,  and 
any  other  frequencies  in  the  vibration  are  superimposed  on  that  natural 
resonance.  In  this  way,  it  is  like  the  problem  of  distinguishing  between 
human  voices  which  have  more  or  less  the  same  formants  or  natural  reso¬ 
nances.  Obtaining  useful  and  measurable  differential  signals  from  the  vibra¬ 
tion  of  a  cane  proved  difficult,  and  conventional  strain  gage  bridge  tech¬ 
niques  did  not  appear  sensitive  enough  to  render  the  different  patterns 
meaningful.  Further  efforts  will  be  made  to  distinguish  among  the  char¬ 
acteristics  of  cane  vibration  when  different  materials  are  probed  or  tapped. 

A third experiment involved measurement of kinesthetic discrimination.
Blindfolded  subjects’  canes  were  guided  into  touching  two  points,  one  to 
their  left  and  one  to  their  right;  the  points  were  24  inches  apart.  Then  they 
were  asked  to  bisect  this  distance,  and  touch  the  midpoint  with  the  cane. 
The  same  experiment  was  repeated,  but  with  the  variation  of  alignment 
of  the  two  points  along  the  depth  axis,  one  near  the  subject  and  one  further 
away.  In  each  case,  the  standard  deviation  of  the  errors  was  about  10  per¬ 
cent  of  the  total  sweep  distance. 
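
The error measure used in this bisection experiment can be sketched as follows. The 24-inch separation is taken from the text; the touch positions are hypothetical values invented for illustration, and the use of the sample standard deviation is an assumption, since the paper does not say which estimator was used.

```python
import statistics

SWEEP_INCHES = 24.0  # separation of the two reference points, from the text


def bisection_error_percent(touch_positions, sweep=SWEEP_INCHES):
    """Standard deviation of bisection errors, as a percent of the sweep.

    touch_positions are measured from the first reference point, so the
    true midpoint lies at sweep / 2.
    """
    midpoint = sweep / 2.0
    errors = [position - midpoint for position in touch_positions]
    return 100.0 * statistics.stdev(errors) / sweep


# Hypothetical touch positions in inches, for illustration only.
sample_touches = [10.1, 14.8, 12.9, 9.3, 13.6, 11.2, 15.1, 8.9]
print(f"{bisection_error_percent(sample_touches):.1f} percent of sweep")
```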

A  fourth  experiment  measured  the  cane  user’s  ability  to  discriminate 
among  various  mechanical  impedance  characteristics  on  the  basis  of  a 
transient  “probe”  or  poke  with  the  cane.  A  cantilever  spring  was  set  up  in 
such  a  way  that  the  blindfolded  subject  could  probe  the  same  place  twice 
in  rapid  succession.  There  were  three  parts  to  the  experiment.  In  the  first 
part, the force at the stop or strain limit was held constant and the
displacement to the stop was varied from the first to the second probe. The
subject stated which displacement was the greater. In the second part, the
displacement  was  held  constant  and  the  force  level  was  varied.  The  subject 
judged  which  force  was  greater.  In  the  third  part,  the  stop  was  removed 
and  the  spring  constant  was  varied  from  the  first  to  the  second  probe,  and 
the  subject  could  push  as  hard  as  he  pleased.  He  was  required  then  to  state 
which  was  the  stiffer. 

Ordinary psychophysical techniques of threshold measurement were
employed  in  this  experiment.  The  “just  noticeable  differences”  (JND’s) 
were  obtained  by  introducing,  at  random,  increments  of  change  amounting 
to  plus  or  minus  2,  4,  8,  16,  or  32  percent  of  the  reference  level. 

The  JND  was  taken  as  the  level  at  which  subjects  reported  correct 
judgments  of  distance  in  75  percent  of  the  trials.  The  experiment  was  per¬ 
formed  at  different  reference  levels  of  displacement  and  spring  constant.  For 
compression  probing,  both  one-  and  ten-pound  net  forces  were  employed 
at  the  strain  limit;  0.1  pound  was  used  in  lateral  probing.  The  data  were 
summarized  as  JND  percent  versus  reference  level. 
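
One way to read a JND off data of this kind is to interpolate between the two increment sizes that bracket the 75 percent criterion. The sketch below assumes linear interpolation, which the paper does not specify, and the proportions of correct judgments are hypothetical; only the increment sizes and the criterion come from the text.

```python
def jnd_percent(increments, proportion_correct, criterion=0.75):
    """Interpolate the increment at which judgments reach the criterion.

    increments: increment sizes in percent of the reference, ascending.
    proportion_correct: fraction of correct judgments at each increment.
    Returns None if the criterion is never reached.
    """
    pairs = list(zip(increments, proportion_correct))
    if pairs and pairs[0][1] >= criterion:
        return pairs[0][0]  # criterion already met at the smallest step
    for (x0, p0), (x1, p1) in zip(pairs, pairs[1:]):
        if p0 < criterion <= p1:
            # Linear interpolation between the two bracketing increments.
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None


# Increment sizes from the text; the proportions are hypothetical.
INCREMENTS = [2, 4, 8, 16, 32]
hypothetical_correct = [0.55, 0.65, 0.80, 0.95, 1.00]
print(jnd_percent(INCREMENTS, hypothetical_correct))
```

With these invented proportions the criterion falls between the 4 and 8 percent increments, so the estimate lands in the 5 to 10 percent range discussed in the text.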

One may conclude from these data that the cane user can resolve 5 to
10 percent differences in either displacement distance or the force
required for compression probing. He is not quite as good in
discriminating among different spring constants. At an axial force level of ten
pounds, and also in probing laterally, the cane bends, and his
environmental discriminations are impaired by the elasticity of the cane itself. (See
Summary Tables in Sheridan, T. B. [4, pp. 45-47].)

USE  OF  AN  OBSTACLE  COURSE  IN 
EVALUATING  THE  MOBILITY  OF  THE  BLIND 

The  problem  in  this  study  (2)  was  to  design  a  representative  obstacle  course 
for  the  blind  and  to  develop  techniques  for  evaluating  their  behavior  in 
traversing such a course. An effort was made to embody the most salient
features of the environment with which the blind traveler interacts. In
constructing the course the following characteristics were included: bounded
and  open  spaces;  type  and  distribution  of  the  obstacle  objects  in  the 
traveler’s  path;  step-ups  and  step-downs;  and  auditory  cues.  The  tech¬ 
niques  of  evaluation  were  simple  enough  that  inexperienced  observers  could 
administer  the  necessary  tests. 

Four  groups  were  used  in  the  tests.  The  first  group  included  five  blind 
subjects who traversed the obstacle course under three environmental
conditions while using the Hoover long cane technique. The remaining three
groups were all composed of sighted but blindfolded subjects who had had
no  previous  experience  with  the  tests.  The  second  group  also  traversed 
the  course  using  the  Hoover  long  cane  technique.  The  third  group  used 
two  long  canes.  The  fourth  group  used  a  modified  cane  in  the  shape  of  a  T. 

Each group traversed the course under three conditions of auditory
cue availability: first, a quiet room; second, background noise produced
by ventilating fans; and third, complete auditory masking. The
experimental measures used were
number  of  taps,  total  traverse  time,  and  the  number  of  “harm  events”  (i.e., 
the  events  which  could  cause  harm  to  the  traveler  or  others  in  his  vicinity) 
observed  using  several  categories  of  definition  of  “harm.” 

The results of an analysis of variance for taps and for time suggest
that both travel experience and cane configuration affect the number of
taps (and, of course, the time, which is proportional to taps under all
conditions). Moderate ambient noise does not increase the number of taps
required;  complete  auditory  masking  does  increase  the  number  of  taps. 
Experienced  blind  travelers  had  fewer  harm  events,  as  a  group,  than  the 
normally  sighted.  The  number  of  harm  events  among  the  blind  travelers 
did  vary  considerably,  however,  from  person  to  person. 
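
The statement that time is proportional to taps can be checked informally against the group means reported in Table 4. The sketch below simply divides mean taps by mean seconds for each group and obstacle-course condition; the dictionary layout and function name are illustrative, and the calculation is a rough consistency check, not the analysis of variance referred to above.

```python
# Group means copied from Table 4: (taps, seconds) per obstacle-course condition.
TABLE_4_MEANS = {
    "5 blind, 1 cane": {
        "quiet": (268, 258), "fans": (193, 179), "white noise": (243, 223)},
    "4 sighted, 1 cane": {
        "quiet": (228, 205), "fans": (222, 209), "white noise": (227, 205)},
    "4 sighted, 2 canes": {
        "quiet": (280, 262), "fans": (238, 201), "white noise": (237, 182)},
    "2 sighted, T cane": {
        "quiet": (361, 344), "fans": (284, 245), "white noise": (265, 210)},
}


def taps_per_second(means):
    """Return the tapping rate for every group and condition."""
    return {
        group: {cond: taps / secs for cond, (taps, secs) in conds.items()}
        for group, conds in means.items()
    }


rates = taps_per_second(TABLE_4_MEANS)
for group, conds in rates.items():
    print(group, {cond: round(rate, 2) for cond, rate in conds.items()})
```

On the obstacle course every group mean works out to between about 1.0 and 1.3 taps per second, which is in line with a roughly constant tapping rate.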

Our experiments suggest that such an experimental course, composed of
artificial objects, is feasible for exploring the mobility of the blind.
An objective “success of travel” scale, based on the “harm events” and
derived from the results obtained, will be proposed below.


The  Obstacle  Course 

The  entire  obstacle  course  was  set  up  in  a  large  and  otherwise  empty  room; 
this  arrangement  provided  us  with  control  over  weather,  interruptions  by 
curious  pedestrians,  and  so  on.  The  course  replicated,  in  terms  of  model 
forms,  such  characteristics  of  the  environment  normally  traversed  by  the 
blind  person  as  bounded  paths,  open  streets,  solid  objects  at  ground  level, 
objects  protruding  from  levels  above  the  ground,  step-ups  and  step-downs, 
automobiles,  sound  reflectors,  and  stairways.  Most  of  the  model  forms 
were  constructed  of  wood.  They  were  arranged  in  such  a  way  as  to 
sample  some  characteristics  of  the  real  environment  (see  Figure  6).  The 
distance  from  the  starting  point  to  the  finishing  point  was  170  feet. 
Figures  7  through  12  show  several  views  of  the  obstacle  course. 


Figure 6 Layout of Obstacle Course

Figure 7 Avoiding Sound Reflector

Figure 8 Anticipating Crack in Sidewalk

Figure 9 Successful Street Crossing

Figure 10 A Narrow Passageway

Figure 11 Confirmation of Path

Figure 12 Finding the Doorway


Experimental  Procedure 

The  subject  first  entered  a  room  on  the  first  floor,  where  an  experimenter 
put  a  blindfold  on  the  subject  and  led  him  to  the  third  floor  where  the 
obstacle  course  was  located.  To  reduce  his  tension  and  to  give  him  some 
cane  practice,  the  subject  was  instructed  to  walk  a  section  of  the  room 
free  of  obstacles.  The  total  distance  of  this  “free  course”  was  170  feet — 
the same distance as that of the obstacle course. The time spent in
traversing the free course was measured with a stop watch, and the number
of taps the subject made with his cane was recorded on a portable tape
recorder carried by an experimenter walking just behind and to the side
of the subject. When he finished the free course, the subject was led to the
beginning of the obstacle course, and provided with the following
instructions:


“This  is  the  starting  point  of  the  obstacle  course.  To  your  right  there 
is a brick wall. You have to walk straight until you reach a brick wall
opposite  you.  Then,  turn  left  and  follow  a  confined  path.  To  your 
right  there  will  be  a  brick  wall,  and  to  your  left  some  two-by-fours 
placed  on  the  cement  floor.  You  have  to  stay  within  the  confined 
path  until  you  reach  a  wooden  platform  ...  we  call  it  a  sidewalk. 
You  have  to  step  up  on  to  the  sidewalk,  and  walk  to  the  end  of  it. 
At  the  end  of  the  sidewalk,  you  have  to  step  down  on  to  the  street  .  .  . 
it  is  the  concrete  floor.  Keeping  a  straight  line,  you  have  to  cross  the 
street  and  find  a  second  sidewalk — a  wooden  platform.  You  have  to 
step  on  it  and  walk  to  the  end,  even  though  it  turns  sharply  to  the 
left  at  one  point.  Once  you  have  reached  the  end  of  the  second  sidewalk, 
you  have  to  step  down  on  to  a  concrete  floor.  The  path  you  have  to 
follow  is  bounded  by  sheets  of  acoustic  tile  placed  on  the  floor.  Walking 
along  this  path,  you  will  come  to  a  stairway.  Your  task  is  to  go  up 
and  go  down  the  stairway.  When  you  are  on  the  concrete  floor  again, 
follow  the  bounded  path  until  I  ask  you  to  stop.  Any  questions?” 

If  the  subject  asked  questions,  the  experimenter  answered  them.  Then  the 
following  resume  was  given  the  subject: 

“As  you  have  gathered,  the  course  is  roughly  circular,  in  the  sense 
that  the  finishing  and  starting  points  almost  coincide.  When  you  come 
up  to  an  object,  you  will  have  to  decide  how  to  navigate  further.” 

The  subject’s  name,  the  date  of  the  trial,  and  the  condition  of  travel 
were  recorded,  and  the  subject  started  to  walk  the  course.  For  purposes 
of  recording,  the  course  was  divided  into  eight  segments.  As  the  end  of 
a  segment  was  reached,  the  experimenter  noted  it  by  recording  a  coded 
number  on  the  tape;  he  also  noted  the  cumulative  time  up  to  that  point. 
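
Since only cumulative times were noted at the end of each segment, the time spent in an individual segment has to be obtained by differencing. A minimal sketch of that bookkeeping follows; the function name and the sample times are hypothetical and do not come from the recorded data.

```python
def segment_durations(cumulative_seconds):
    """Convert cumulative times at segment ends into per-segment durations."""
    durations = []
    previous = 0.0
    for t in cumulative_seconds:
        if t < previous:
            raise ValueError("cumulative times must not decrease")
        durations.append(t - previous)
        previous = t
    return durations


# Hypothetical cumulative times (seconds) at the ends of the eight segments.
example_times = [21, 48, 80, 112, 150, 189, 230, 258]
print(segment_durations(example_times))
```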


Harm  events  were  recorded  as  they  occurred  during  the  course;  three 
types  of  event  were  noted:  bumping  into  an  object;  tripping  on  an  object; 
or getting the cane stuck. Bumping was defined by significant body contact.
Tripping  was  defined  as  the  subject’s  momentarily  losing  his  balance  when 
his  feet  unexpectedly  collided  with  something  (no  one  was  hurt  as  a  result 
of  tripping).  Cane  sticking  was  defined  by  lodging  its  end  between  two 
objects  or  in  the  “cracks”  of  the  “sidewalk.” 

Each  subject  traversed  the  free  course  four  times;  he  also  traversed  the 
obstacle  course  three  times  under  each  of  the  three  different  conditions.  For 
the  first  condition  quiet  obtained;  for  the  second  condition  two  exhaust  fans 
were  turned  on  to  provide  a  background  noise  level  of  60  db  sound  pressure; 
for  the  third  condition  the  subject  wore  Willson  Sound  Barrier  earphones 
which  damped  external  sound  energy  and  gave  80  to  100  db  white  noise 
sound  pressure  from  a  modified  transistor  radio  to  mask  out  all  residual 
auditory  cues.  Sighted  subjects  were  not  allowed  to  see  the  course  until  they 
had  traversed  it  under  all  the  conditions.  These  subjects  were  then  allowed 
to  remove  blindfolds,  and  were  asked  to  traverse  the  course  once  again, 
using  the  canes.  Tape  recordings  were  made  for  this  trial  just  as  for  the 
earlier  trials. 
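
To give a sense of scale for the noise levels quoted, the sketch below converts sound pressure level in decibels to RMS pressure. It assumes the conventional reference pressure of 20 micropascals (0.0002 dyne per square centimeter); the text does not state which reference was used.

```python
# Assumed reference: 20 micropascals (0.0002 dyne/cm^2); not stated in the text.
REFERENCE_PRESSURE_PA = 20e-6


def spl_to_pascals(level_db, reference=REFERENCE_PRESSURE_PA):
    """Convert a sound pressure level in decibels to RMS pressure in pascals."""
    return reference * 10 ** (level_db / 20.0)


for level in (60, 80, 100):
    print(f"{level} db -> {spl_to_pascals(level):.3g} Pa")
```

Under that assumption the 60 db fan noise corresponds to about 0.02 pascal, and the 80 to 100 db masking noise to about 0.2 to 2 pascals, that is, ten to a hundred times the sound pressure of the fans.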

Three  variations  of  cane  plus  cane  technique  were  used.  The  first 
combination — which  was  used  for  the  majority  of  the  tests — was  the 
common  aluminum  long  cane  and  the  Hoover  technique.  The  second  com¬ 
bination  used  two  common  aluminum  canes;  the  second  cane  was  held  in 
the  normally  free  hand  and  manipulated  in  a  mirror  image  of  the  first 
cane — resulting  in  a  kind  of  crisscross  pattern.  The  third  combination  con¬ 
sisted  of  a  modified  common  cane  used  with  the  Hoover  technique.  The 
modification  consisted  of  a  12-inch  aluminum  bar  attached  perpendicularly 
to  the  tip  of  the  cane,  thus  providing  double  contact  points  with  the  en¬ 
vironment  on  the  ends  of  the  T  bar. 


Subjects 

The  five  blind  subjects  were  drawn  from  the  St.  Paul’s  Rehabilitation  Cen¬ 
ter  in  Newton,  Massachusetts,  and  were  all  in  their  last  week  (the  16th)  of 
training  in  the  use  of  the  cane.  The  sighted  subjects  were  MIT  students. 
The retraining problem with sighted students prompted us to use each
group of sighted subjects with only one cane condition. Four subjects each
participated in the one-cane and two-cane conditions, and two subjects
participated in the T-cane condition.



Results 

As mentioned above, the tape recorded data consisted of cane taps,
cumulative time in seconds, and harm events, while traversing the free course
and while traversing the obstacle course. Table 4 shows the arithmetic
means of cane taps, time in seconds, and harm events, for four groups of
subjects under all conditions and for 170-foot segments of the free and the
obstacle courses.


table 4

MEANS OF CANE TAPS, TIME SPENT TRAVERSING THE COURSE,
AND HARM EVENTS FOR FOUR GROUPS OF SUBJECTS

                                    Free          Obstacle Course, 170 feet††
                                    walking†
Measure-     Groups of              course     Quiet            Earphone       No
ments        Subjects               170 feet   Room     Fans    White Noise    Blindfold

Means of     5 blind, 1 cane          104       268      193       243            --
Cane Taps    4 sighted, 1 cane        117       228      222       227            85
             4 sighted, 2 canes       104       280      238       237            65
             2 sighted, “T” cane       93       361      284       265           107
             (column sums)            418      1137      937       972           257

Mean of      5 blind, 1 cane           63       258      179       223            --
Time in      4 sighted, 1 cane         76       205      209       205            56
Seconds      4 sighted, 2 canes        71       262      201       182            54
             2 sighted, “T” cane      115       344      245       210            68

Means of     5 blind, 1 cane           --         7        7        19            --
Harm         4 sighted, 1 cane         --        21       21        23            --
Events       4 sighted, 2 canes                  18       15        15            --
             2 sighted, “T” cane                121      121        17

† Means based on four walks of 170 feet segments for every subject.
†† Means based on three replications of each condition for every subject.


Tables 5 through 8 show the number of taps, time in


26 


Man-Machine  Systems 


TABLE  5 

BLIND  SUBJECTS  USING  ONE  CANE:  RESULTS  FOR  EACH  TRIAL 
OF  CANE  TAPS,  TIME  AND  FREQUENCY  OF  HARM  EVENTS 


Free  walk  Obstacle  Course 

Subjects  340  feet  Quiet  room  Fans  White  noise 


taps 

150 

152 

153 

147 

139 

139 

119 

118 

256 

303 

265 

J.H. 

time 

124 

123 

114 

102 

109 

169 

98 

98 

263 

249 

215 

harm 

— 

— 

— 

— 

— 

2 

1 

1 

8 

/ 9 

6 

taps 

287 

230 

449 

353 

266 

243 

247 

347 

256 

273 

L.B.f 

time 

248 

176 

487 

413 

241 

245 

245 

336 

280 

301 

harm 

— 

— 

2 

2 

5 

1 

2 

7 

3 

7 

taps 

232 

216 

343 

275 

240 

211 

201 

192 

232 

234 

233 

F.R.f 

time 

145 

134 

329 

219 

199 

205 

174 

162 

212 

193 

198 

harm 

— 

— 

3 

1 

4 

1 

2 

— 

5- 

7 

3 

taps 

261 

198 

415 

273 

231 

253 

182 

171 

193 

193 

171 

M.A.f 

time 

211 

144 

406 

259 

199 

247 

195 

167 

208 

190 

156 

harm 

— 

— 

3 

2 

5 

5 

4 

5 

6 

8 

6 

taps 

178 

299 

195 

179 

195 

188 

177 

280 

219 

192 

E.W. 

time 

131 

214 

167 

210 

169 

146 

133 

227 

173 

147 

harm 

— 

4 

5 

6 

4 

2 

2 

10 

4 

6 

t  Female  Subject 


TABLE  6 

SIGHTED-BLINDFOLDED  SUBJECTS  USING  ONE  CANE:  RESULTS  FOR 
EACH  TRIAL  OF  CANE  TAPS,  TIME,  AND  FREQUENCY  OF  HARM  EVENTS 


Obstacle 

Sub-  Free  walk  Course  No 

jects  340  feet  Quiet  room  Fans  White  noise  blindfolds 


taps 

189 

182 

285 

268 

262 

221 

229 

218 

253 

237 

196 

78 

JD 

time 

174 

137 

337 

265 

232 

190 

195 

190 

213 

197 

171 

51 

harm 

— 

— 

8 

6 

5 

6 

7 

2 

10 

8 

4 

— 

taps 

209 

225 

205 

174 

182 

187 

222 

202 

222  205 

215 

— 

JT 

time 

110 

125 

186 

102 

104 

184 

150 

127 

138 

130 

124 

— 

harm 

— 

— 

4 

2 

4 

2 

3 

1 

1 

1 

2 

— 

taps 

237 

— 

243 

269 

201 

238 

226 

269 

269 

241 

235 

96 

ENt 

time 

175 

— 

283 

253 

215 

298 

273 

338 

283 

297 

228 

75 

harm 

— 

— 

6 

10 

7 

11 

12 

16 

13 

6 

5 

— 

taps 

228 

201 

227 

223 

195 

236 

209 

203 

245 

190 

197 

82 

MMt 

time 

162 

159 

164 

194 

168 

206 

177 

175 

168 

148 

165 

42 

harm 

— 

— 

11 

8 

12 

7 

5 

12 

6 

9 

12 

— 

|  Female  Subject 


27 


Information  Generation:  The  Cane 


table  7 

SIGHTED-BLINDFOLDED  SUBJECTS  USING  TWO  CANES:  RESULTS 
FOR  EACH  TRIAL  OF  CANE  TAPS,  TIME,  AND 
FREQUENCY  OF  HARM  EVENTS 


Obstacle 

Sub-  Free  walk  Course  No 

jects  340  feet  Quiet  room  Fans  White  noise  blindfolds 


taps 

243 

247 

272 

236 

318 

254 

265 

309 

316 

343 

103 

RC 

time 

217 

323 

267 

244 

301 

231 

298 

244 

234 

244 

85 

harm 

— 

4 

4 

6 

7 

8 

3 

2 

3 

4 

— 

taps 

169 

184 

215 

326 

289 

206 

186 

170 

228 

207 

213 

49 

JH 

time 

126 

141 

152 

248 

194 

148 

138 

140 

158 

149 

150 

46 

harm 

— 

— 

6 

5 

2 

6 

5 

6 

5 

11 

9 

— 

taps 

256 

175 

337 

321 

265 

226 

228 

178 

232 

213 

180 

66 

BK 

time 

233 

157 

377 

314 

246 

218 

215 

191 

189 

175 

156 

51 

harm 

— 

— 

8 

12 

7 

5 

2 

3 

6 

2 

2 

— 

taps 

160 

229 

400 

284 

281 

220 

217 

180 

218 

212 

178 

42 

JC 

time 

156 

136 

385 

259 

244 

192 

187 

158 

173 

173 

145 

35 

harm 

— 

— 

9 

5 

6 

6 

8 

3 

8 

3 

5 

— 

TABLE  8 

SIGHTED-BLINDFOLDED  SUBJECTS  USING  “T”  CANE:  RESULTS 
FOR  EACH  TRIAL  OF  CANE  TAPS,  TIME,  AND 
FREQUENCY  OF  HARM  EVENTS 


Sub¬ 

jects 


Free  Obstacle  Course 

walk  No 

340  feet  Quiet  room  Fans  White  noise  blindfolds 


taps 

201 

210 

478 

309 

348 

309 

307 

277 

242 

253 

238 

131 

time 

291 

165 

409 

230 

245 

213 

227 

179 

172 

165 

141 

76 

harm 

— 

— 

3 

2 

2 

3 

3 

2 

2 

4 

1 

— 

taps 

273 

251 

351 

303 

379 

329 

252 

225 

303 

297 

259 

83 

time 

316 

149 

432 

368 

447 

369 

263 

221 

312 

254 

221 

60 

harm 

— 

— 

4 

5 

9 

7 

5 

3 

8 

15 

5 

— 

28 


Man-Machine  Systems 

seconds,  and  frequency  of  harm  events  for  each  traversal  of  the  two  courses. 
Table  9  shows  the  frequency  of  three  kinds  of  harm  events  mentioned. 


TABLE 9

FREQUENCY OF HARM EVENTS FOR INDIVIDUAL SUBJECTS

                          Bumping into    Cane getting    Tripping over
Subjects                  Obstacles       Stuck           Obstacles

Blind            FR†          18              -                8
(one cane)       MA†          26              1               17
                 LB†          16              3               10
                 JH           12              -               15
                 EW           28              1               14

Sighted          JD           33              3               20
(one cane)       EN†          44              9               33
                 MM†          30              8               44
                 JT           12              5                3

Sighted          RC           21              4               16
(two canes)      JH           23              9               23
                 BK           24             10               13
                 JC           28              9               16

Sighted          MC           12              7                3
(T cane)         JH           35              4               22

† Female Subject


Plotting the number of cane taps as a function of trial time yielded a
reasonably straight regression line. Figures 13 through 16 show
the regression lines for each group of subjects.
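
The straight-line relation between taps and time can be illustrated with an ordinary least-squares fit. The sketch below is only an illustration, not the authors' computation: it fits the eleven trials of a single blind subject (J.H., values as printed in Table 5), whereas the figures plot whole groups, and the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return intercept, slope, r


# Trial times (seconds) and cane taps for subject J.H., as printed in Table 5.
jh_time = [124, 123, 114, 102, 109, 169, 98, 98, 263, 249, 215]
jh_taps = [150, 152, 153, 147, 139, 139, 119, 118, 256, 303, 265]

intercept, slope, r = fit_line(jh_time, jh_taps)
print(f"taps = {intercept:.1f} + {slope:.2f} * time   (r = {r:.2f})")
```

For this subject the slope comes out a little below one tap per second, with a correlation above 0.9, which agrees with the description of a reasonably straight line.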


Figure 13 Blind Subjects Using One Cane. Cane taps as a function of time
for each trial.

Figure 14 Sighted But Blindfolded Subjects Using One Cane. Cane taps as
a function of time for each trial.

Figure 15 Sighted But Blindfolded Subjects Using Two Canes. Cane taps
as a function of time for each trial.

Figure 16 Sighted But Blindfolded Subjects Using T Cane. Cane taps as
a function of time for each trial.


Discussion  and  Analysis 

An analysis of variance for number of taps, groups of subjects, and
conditions of the test revealed significant differences among groups of subjects
and conditions of tests, both at the 95 percent confidence level. The
analysis was done on the basis of a 3 by 5 table; the three rows were
labelled “quiet room,” “fans,” and “white noise,” and the columns were
labelled “order of assignment,” “five blind subjects, one cane,” “four
blindfolded subjects, one cane,” “four blindfolded subjects, two canes,”
and “two blindfolded subjects, T cane.” The F ratios for groups of subjects
and conditions of the test were 3.91 and 3.89, respectively, and both
were significant at P < 0.05, as indicated already.

A  second  analysis  of  variance,  this  time  for  total  traverse  time,  was  set 
up  in  a  manner  similar  to  that  for  the  analysis  of  number  of  taps,  with 
respect  to  rows  and  columns.  The  analysis  showed  that  the  F  ratio  for 
conditions of the tests was 3.06; we can say at the 95 percent level of
confidence that this source of the variance is statistically significant. This
source  was  the  only  significant  variable  in  the  analysis. 

The significance of the contribution to the variance of the factor of
“groups of subjects,” which showed an F ratio of 1.12, is a rough indication
at best, for it can be seen from the experimental design that the differences
between the blind and sighted subjects, and between cane treatments
within the sighted but blindfolded category, were confounded. A glance
at  Table  4  will  indicate  that,  for  both  the  number  of  taps  and  for  total 
traversal  time,  the  differences  between  blind  and  sighted  subjects  appear 
about  as  large  as  the  differences  between  cane  treatments  within  the  category 
of  sighted  subjects. 

The  same  may  be  said  for  the  “conditions”  effect,  which  was  significant 
with respect to taps of the cane. The “conditions” effect per se was
confounded with order. Again, by referring to Table 4 we can see that both
number of taps and total traversal time tend to decrease from the “quiet
room” to the “fans” conditions, suggesting either an experience improvement
which was not offset by a performance decrement due to lost auditory
cues,  or  that  a  small  amount  of  ambient  noise  was  actually  useful  and 
fewer  taps  were  needed!  Both  number  of  taps  and  traversal  time  increased 
again  for  the  “white  noise”  condition,  indicating  a  tendency  towards  more 
taps  and  a  slower  pace  in  spite  of  any  experience  or  other  order  effect. 

Table  4  is  in  fact  more  revealing  than  this.  If  we  compare,  for  the  free 
course,  the  number  of  cane  taps  and  the  traversal  time,  with  the  taps  and 
time for the obstacle course, some large but not quite unexpected differences
appear. All of the subjects show a minimum doubling, both in the
average  number  of  taps  and  in  the  average  time  of  traversal  for  the  170-foot 
course.  We  might  infer  from  this  change  in  behavior  that  there  are  two 
different  uses  of  the  cane.  In  the  free  course  the  cane  might  have  been  used 
to  confirm  a  strong  hypothesis  or  expectation  on  the  subject’s  part  that  the 
future  environment  was  to  be  like  that  which  was  just  encountered.  In  the 
obstacle  course,  on  the  other  hand,  the  cane  might  have  been  used  as  a 
probe  in  order  to  find  an  open  passageway.  In  the  latter  case,  the  prior 
environment  differed  from  the  future  environment,  and  segments  of  the 
path which were both small and independent of one another had to be
examined in detail before the subject could venture forward. In other words,
both  more  taps  and  more  time  are  needed  to  explore  the  environment  in 
detail. 

Table  4  also  indicates  some  interesting  behavioral  differences  between 
groups  of  subjects.  The  group  of  blind  subjects  seemed  to  be  more  sensitive 
to  environmental  changes  than  the  sighted  but  blindfolded  subjects  using 
one cane. But we must remember that this sensitivity to conditions may
have arisen from the acoustic masking conditions themselves, or from the
ordering of conditions, which was the same for every group. The only group
showing a consistent improvement through all conditions was that of the
sighted but blindfolded subjects using the T cane. The sighted but blindfolded
subjects using one cane remained at a stable plateau of tapping rate throughout
all  the  conditions  of  test. 

From  the  measurements  taken  of  number  of  taps  and  of  total  traversal 
time,  therefore,  we  can  conclude  only  that  perhaps  time  and  number  of 
taps  may  be  related  throughout  all  the  treatments  (see  Figures  13  through 
16). 

Table  4  also  shows  that  the  measure  of  “harm  events”  differentiates  the 
sighted  from  the  blind  subject  groups.  A  Chi-square  test  for  harm  events 
among  the  four  groups  of  subjects  was  significant  at  the  0.05  level  (Chi- 
square  was  8.76  with  3  degrees  of  freedom).  The  differences  in  harm  events 
within  the  sighted  but  blindfolded  groups  of  subjects  suggest  that  the  cane 
provided  different  information  to  the  user  about  the  environment. 
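
The significance claim for the four groups can be checked against a standard chi-square table. The sketch below assumes only the reported statistic (8.76 with 3 degrees of freedom) and the usual tabulated upper 5 percent points; the contingency table behind the statistic is not given in the text, so the statistic itself is not recomputed, and the helper name `exceeds_critical` is hypothetical.

```python
# Upper 5 percent points of the chi-square distribution (standard table values).
CHI_SQUARE_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def exceeds_critical(statistic, degrees_of_freedom):
    """Return True when the statistic lies beyond the 0.05 critical value."""
    return statistic > CHI_SQUARE_CRITICAL_05[degrees_of_freedom]


# Value reported in the text for harm events among the four groups of subjects.
print(exceeds_critical(8.76, 3))
```

Since 8.76 is larger than 7.815, the reported value is consistent with significance at the 0.05 level.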

Observations made during the experimental session by the experimenters,
combined with some questioning of the subjects, revealed that the
blind  and  the  sighted  subjects  used  different  criteria  for  detecting  and 
avoiding  obstacles.  The  blind  subjects  used  acoustic  cues  to  detect  hanging 
obstacles  (they  were  able  to  detect  the  hanging  platform  at  the  end  of  six 
trials). When these cues were blocked by white noise, the blind subjects
could not avoid bumping into hanging obstacles. On the other hand, the
sighted but blindfolded subjects used such acoustic cues only rarely, or
indeed were unaware of their existence; they evidently preferred to estimate
the  distance  from  already  encountered  obstacles  to  the  suspended  platform. 

The group which used two canes displayed an entirely different pattern
of cane use from all others. One of the canes, usually that in the nondominant
hand, was reserved for maintaining continuous contact with the
environment, and was dragged along the floor; the second cane was tapped in
the conventional way.

Incidentally,  a  Chi-square  analysis  of  harm  events  for  the  individual 
subjects  in  the  group  of  blind  students  was  made,  with  a  significant  result 
(Chi-square was 9.32, with 4 degrees of freedom).

It is our conclusion from this study that, even with a very limited variety
of characteristics with which we can imitate the features of the real
environment of the traveler, the factors of experience and availability of
cues in the mobility of blind travelers using the cane can be examined with
profit.

Although a 15-point scale has been suggested in the past for rating the
mobility capability of blind travelers (5), the successful observation of
cane tapping, sensitivity to the environment, and the occurrence of “harm
events” in this study suggests that expansion of the scale is possible. A scale
of objective “success of travel” could be used to augment the rather more
subjective mobility rating scale already proposed. No claim is made, however,
that the present study provides enough data for actually constructing such
a “success of travel” scale, nor does it provide indices with which we could
measure sufficiently well the cane user’s ability to travel. We do suggest
that further experimentation is in order to refine the categories and
recording conventions for “harm events.”


REFERENCES 

1. DeFazio, T. L., and T. B. Sheridan, “Vibration Analysis of the Cane,”
   Evaluation Report on Work in Progress on Sensory Aids and Prosthetics.
   Cambridge, Massachusetts: Massachusetts Institute of Technology, 1962.
   (Report No. 8768-3, Department of Mechanical Engineering.) Pp. 61-67.

2. Mickunas, J., Jr., and T. B. Sheridan, “Use of an Obstacle Course in
   Evaluating Mobility of the Blind,” Evaluation Report on Work in Progress
   on Sensory Aids and Prosthetics. Cambridge, Massachusetts: Massachusetts
   Institute of Technology, 1962. (Report No. 8768-3, Department of
   Mechanical Engineering.) Pp. 68-88.

3. Potash, Leonard, “Correlates of the Tactual and Kinesthetic Stimuli in the
   Blind Man’s Cane.” Unpublished Master’s thesis, Massachusetts Institute of
   Technology, 1961.

4. Sheridan, Thomas B., “Engineering Analysis of Cane Information
   Acquisition,” in J. W. Linsner (ed.) Proceedings of the Mobility Research
   Conference. New York: American Foundation for the Blind, 1962, pp. 42-47.

5. Winer, David, “A Test and Interview Battery for Blind Travelers,” in
   J. W. Linsner (ed.) Proceedings of the Mobility Research Conference.
   New York: American Foundation for the Blind, 1962, pp. 64-75.


ACTIVE  ENERGY  RADIATING  SYSTEMS: 


THE  BAT  AND  ULTRASONIC  PRINCIPLES  I* 

J. DAVID  PYE 

Institute of Laryngology and Otology, London, England


In man the two most developed ‘senses-at-a-distance’ are vision and
hearing. When one of these is deficient it seems natural to attempt to achieve
some compensation with the other, as in lip reading for the deaf. In the
past, several obstacle detecting devices have been tried for the blind, but
none has been completely acceptable because insufficiently detailed
information was presented to the user about his surroundings.

Audible systems, using the unaided ear as receiver, present several
drawbacks and also introduce undesirable conspicuity. The eye and the ear
handle  incoming  energy  in  quite  different  ways.  Even  a  single  eye  can  form 
a  spatial  concept  of  its  surroundings  by  the  formation  of  an  image  and  by 
variation  of  focus.  But  the  cochlea  receives  only  energy  transmitted  to  it 
along  the  ossicle  chain  from  a  ‘point  receiver,’  and  interaction  of  the  two 
ears is necessary to achieve accurate and immediate localization. This
interaction involves comparison of the time of arrival of sounds at each ear
(and  at  lower  frequencies,  the  phase  differences)  coupled  with  differences 
in  intensity  due  to  the  directional  receiving  properties  of  the  external  ears. 
In  some  way  the  brain  compares  signals  from  the  two  acoustic  nerves  and 
computes  the  direction  of  the  sound  source. 

It is doubtful whether this mechanism is capable of locating two or
more sources simultaneously, especially in the case of reflected echoes,
since the nature of the sound is then the same for all sources. Also the
sensitivity of the ear is suppressed for a short time following the
production of a brief intense signal by the transmitter. It is for this reason
that echoes from short distances are seldom noticed. Griffin (3) has drawn
attention to a very convincing demonstration of this effect. Clapping the
hands in an ordinary room produces a very large number of echoes, but
they evoke little sensation. The effect is much the same when the sound
is recorded. If the same recording is reversed and played backwards at
the same speed, the multiple echoes are heard in increasing, instead of
diminishing, order of intensity and therefore become much more obvious.

* The research reported on here has been sponsored by the Air Force Office of
Scientific Research (OAR), through the European Office, Aerospace Research, United
States Air Force.

While  some  skill  can  be  achieved  by  certain  individuals  in  the  use  of 
clicks  for  orientation,  much  improvement  is  desirable  to  permit  more 
general  application. 

Ultrasonic devices have also been used, usually with a heterodyne
system to make the echoes audible. These should theoretically give improved
acuity due to the shorter wavelengths used, but most of the echo
information is not conveyed in a form acceptable for easy interpretation. A
further possibility, the use of ambient noise, can be ruled out in favor of the
independence inherent in an active system.

It is fortunate to find that several vertebrate animals can orientate
successfully  by  producing  sounds  usually  inaudible  to  us  but  which  they  can 
hear,  and  it  is  pertinent  to  inquire  how  they  overcome  the  problems  just 
described.  Griffin  and  his  colleagues  (2)  have  amply  demonstrated  the 
amazing  sensitivity,  accuracy,  and  speed  displayed  by  the  bat,  but  the  way 
in  which  these  are  achieved  is  still  obscure.  Examination  of  the  available 
evidence,  however,  has  suggested  an  hypothesis  which  may  have  direct 
application  in  the  design  of  a  mobility  aid  for  the  blind. 

It has been reported by Möhres (9) and Dijkgraaf (1) that the two
families of European bats discriminate most accurately over those ranges for
which the echo returns before pulse production is complete. Since the pulse
duration is very different in the two groups, this finding seems to be of
great significance. Both appear to respond to objects and prey down to a
range of only a very few centimeters, and the maximum range is greater
for bats with longer pulses. This suggests that the emitted sound interacts
with, instead of masking, close range echoes and forms a reference signal to
achieve a transformation of the echo-information. The mode of action then
depends on the design of the bat pulses, which appear to be ideal for two
distinct possibilities.

Consider first the Vespertilionidae, whose appearance is typified by the
species Pipistrellus pipistrellus (see Figure 1). The muzzle is simple and
unadorned, and orientation cries are emitted through the open mouth. The
ears are large, with a tragus of unknown function, and are immobile on the
head. The pulses of the Vespertilionidae are very short in duration and
composed of a pure tone, free of harmonics, whose frequency falls steadily
throughout (see Figure 2). When recorded on tape and replayed at 1/32 of
the original speed, the cries become audible and sound like chirps. At normal
speed the bat produces such signals at rates from 10 to over 100 per second.
Because of the frequency change, an echo whose front returns during pulse
production will be of higher frequency than the outgoing sound at all times
(see Figure 3). If the rate of frequency change is constant, i.e., the
frequency modulation is linear, the difference in frequency will be constant
and proportional to the echo delay and therefore to the distance of the
reflecting object. By obtaining a multiplicative term with a nonlinear detector
the difference in frequency may be produced as a separate note, the
difference or beat note.

Figure 1 The Head of a Vespertilionid Bat, Pipistrellus pipistrellus (x5)
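
The proportionality between beat note and range can be put in numbers. The sketch below is an illustration under stated assumptions, not data from the text: the pulse (a linear fall from 80 kc/sec to 40 kc/sec in 2 msec), the speed of sound (343 m/sec), and the function names are all hypothetical. It uses only the relation given above, beat frequency equals sweep rate times echo delay, with the delay being the two-way travel time.

```python
SPEED_OF_SOUND = 343.0  # m/sec in air; an assumed round figure


def beat_frequency(sweep_rate, target_range, c=SPEED_OF_SOUND):
    """Beat note (cps) for a linear sweep (cps per second) and a range (m)."""
    echo_delay = 2.0 * target_range / c  # out-and-back travel time, seconds
    return sweep_rate * echo_delay


def range_from_beat(beat, sweep_rate, c=SPEED_OF_SOUND):
    """Invert the relation: recover the range (m) from the beat note (cps)."""
    return beat * c / (2.0 * sweep_rate)


# Hypothetical pulse: falls linearly from 80 kc/sec to 40 kc/sec in 2 msec.
sweep_rate = (80_000.0 - 40_000.0) / 0.002  # cps per second

# Ranges short enough for the echo to return before the 2 msec pulse ends.
for target_range in (0.1, 0.2, 0.3):
    beat = beat_frequency(sweep_rate, target_range)
    print(f"range {target_range:.1f} m -> beat note {beat:.0f} cps")
```

With these assumed figures, doubling the range doubles the pitch of the beat note, which is the property the hypothesis relies on.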

Thus if the bat’s ear involves a nonlinear process, it need not hear the
true echo at all, and the cochlea can ‘read’ the range of the object from
the pitch of the beat note. As the cochlea acts partially as a Fourier
analyzer, multiple objects can be observed simultaneously. This feature
is difficult to envisage if the animal works directly on the multiple echoes,
since the acoustic nerve cannot transmit sufficient information to the brain
in the same frequency band in the time available. Since the bat pulse
seldom sweeps more than one octave in frequency, the beat note will
always lie below the band of emitted frequencies, further avoiding the
problem of masking. The mechanism is also insensitive to deception by the
signals of other bats flying nearby.

Figure 2 An Oscillograph Record of a Pulse of a Vespertilionid Bat
(Nyctalus noctula) with Time Bars of 100 msec (Reads Left to Right)

Figure 3 The Formation of a Beat-Note from a Frequency Modulated
Pulse and Echo

A deviation from the linearity of frequency modulation would lead
to changes in frequency during each beat note, although the form of the
note would always be characteristic of the echo delay. But the pulse sweep
of many Vespertilionids is remarkably close to the linear condition.
Theoretically, further information can be obtained if both ears are used to
detect beat notes, for, although the interaural distance of the bat is small, the
ranging mechanism would be sufficiently sensitive to give measurably
different beat notes in each ear for reflecting objects not in the sagittal plane.
Since each note gives the range from that ear, any two notes will ‘place’
the object at the conjunction of two spherical loci. This forms a circular
locus to one side with its axis passing through the two ears (see Figure 4).
Slight lateral tilting of the head to construct a second circle will then place
the object unambiguously (Figure 5).
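
The circular locus follows from elementary geometry and can be computed directly. In the sketch below the two ears are assumed to lie on the x-axis at equal distances either side of the origin; the coordinate convention, the ear separation used in the example, and the function name are hypothetical and are not taken from the text.

```python
import math


def circular_locus(range_left, range_right, ear_separation):
    """Intersect two range spheres centred on the ears.

    The ears are assumed to sit on the x-axis at -d/2 (left) and +d/2 (right).
    Returns (x, radius): the locus is a circle centred on the x-axis at
    position x, lying in a plane perpendicular to that axis.
    """
    d = ear_separation
    x = (range_left ** 2 - range_right ** 2) / (2.0 * d)
    radius_squared = range_left ** 2 - (x + d / 2.0) ** 2
    if radius_squared < 0.0:
        raise ValueError("ranges are inconsistent with the ear separation")
    return x, math.sqrt(radius_squared)


# Example: ears 2 cm apart (assumed) and a target 0.3 m to the right of the
# midline and 0.4 m away from the interaural axis.
d = 0.02
left = math.hypot(0.3 + d / 2.0, 0.4)
right = math.hypot(0.3 - d / 2.0, 0.4)
x, radius = circular_locus(left, right, d)
print(f"circle centred at x = {x:.3f} m, radius {radius:.3f} m")
```

Every point on the returned circle is at the same pair of distances from the two ears, which is why a second measurement with the head tilted is needed to pick out a single point.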

In  favor  of  this  theory,  it  is  known  that  these  bats  produce  their  sounds 
through  a  fairly  wide  angle;  that  their  ears  show  little  directionality  as 
sound  receivers;  and  that  both  ears  must  be  used.  A  Vespertilionid  with 
one  ear  plugged  tends  to  fly  round  clumsily  in  circles.  No  beat  note  detector 
has  yet  been  found  in  the  bat’s  ear,  but  the  point  of  interest  here  is  that  a 
frequency modulated signal can easily be made to present detailed
information in a form acceptable to the cochlea.

Figure 4 Range Measurer from Two Points gives
a Circular ‘Locus of Ambiguity’

Figure 5 Tilting the Head can Remove Ambiguity

Figure 6 The Head of a Rhinolophid Bat, Rhinolophus ferrumequinum (x3)

Figure 7 An Oscillograph Record of the Ends of a Pulse of Rhinolophus
ferrumequinum with Time Bars of 100 msec (Reads Left to Right). Sixty
msec of continuous signal has been removed from the gap.

Turning to the other European family, the Rhinolophidae, it is seen
that the face is very different in appearance (Figure 6). Sound is emitted
through the nostrils which are a half-wavelength apart (9) and
surrounded by a highly sculptured structure called the nose-leaf. The ear has
no tragus but a flap-like antitragus across the base of its aperture, and again
in contrast to the Vespertilionidae the ears may be turned rapidly through
a wide angle on the head. The pulse duration is about 10 times as great as
in the Vespertilionidae, i.e., from 10 to 100 msec, and the frequency is
extremely constant, except for a fall at the end of each pulse (see Figure 7).
Recordings slowed down by 32 times sound like steam whistles. About five
to ten pulses are produced each second at normal speed. Possibly the
terminal phase is used for ranging, but the long constant frequency presents
new possibilities. If the bat and object are moving relatively to one another,
as must happen constantly in flight, the received echo will be changed in
frequency due to Doppler shifts (see Figure 8). Thus beat notes between
the two sounds will have a frequency proportional not to the range, but
to the time differential of range, the relative velocity. The magnitude of
this shift is shown in Figure 9. A 100 kc/sec note typical of some species,
after reflection from a target approaching at 10 meters per second, will
be received at 106.25 kc/sec, and will produce a beat note of 6.25 kc/sec.
Incidentally, the effect of Doppler shifts on the frequency modulation
ranging system is negligible due to the speed and depth of the modulation.
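
The quoted Doppler figures can be reproduced with the usual two-way factor for an echo from an approaching reflector, (c + v) / (c - v). The speed of sound is not stated in the text; the value of 330 m/sec used below is an assumption, chosen because it returns the printed 106.25 kc/sec, and the function name is hypothetical.

```python
def echo_frequency(emitted, closing_speed, c=330.0):
    """Echo frequency (cps) from a reflector closing at closing_speed (m/sec).

    Uses the two-way Doppler factor (c + v) / (c - v); c is an assumed
    speed of sound in m/sec.
    """
    return emitted * (c + closing_speed) / (c - closing_speed)


emitted = 100_000.0                    # 100 kc/sec call
echo = echo_frequency(emitted, 10.0)   # target approaching at 10 m/sec
beat = echo - emitted
print(f"echo {echo / 1000:.2f} kc/sec, beat note {beat / 1000:.2f} kc/sec")
# prints: echo 106.25 kc/sec, beat note 6.25 kc/sec
```

Because the beat note depends on the closing speed and not on the distance, a stationary target at any range would give no beat note under this scheme.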


Figure 8 The Construction of a Beat-Note from a Pulse of Constant
Frequency. (Axes: frequency against time. Labels: call, echo, Doppler
shift, beat note.)


Since the emitted sound is beamed to some extent by the nose-leaf and
by interference at the nostrils, Möhres (9) has suggested that the ears are
also directional (the direction of targets could then be found by beam and
ear movements, and the range determined by triangulation). But the ears
are too small to be highly directional sound collectors, for they measure
only a few wavelengths across. The Doppler shift mechanism offers an
improvement in this direction. Superimposed on the quick and independent
searching movements of each ear, there are extremely rapid flickering
movements in which the ears are 180 degrees out of phase. These would
impose additional Doppler shifts on the echoes which would be proportional
to the component of movement in the direction of the target (see Figure
10). Directional sensitivity obtained in this way is quite independent of the
size of the ear itself.

The  Doppler  hypothesis  suggests  that  there  would  be  little  advantage 
in  binaural  interaction.  These  bats  continue  to  fly  with  considerable  skill 
with  one  ear  plugged,  unlike  the  Vespertilionidae,  but  if  ear  movements 
are surgically suppressed a bat with otherwise normal hearing is
disorientated.


Figure 9 The Magnitude of Doppler Shifts in Echolocation. The
frequency of the emitted signal must be multiplied by the Doppler factor to
obtain the echo frequency at a given relative velocity.


One  further  possibility,  which  does  not  appear  to  be  used  by  any  bat, 
is  shallow  sinusoidal  frequency  modulation  with  a  variable  period  of 
modulation.  If  the  periodicity  be  adjusted  so  that  the  Doppler  beat  note 
is  constant,  the  range  of  the  target  is  equal  to  half  the  wavelength  of  the 
modulation frequency. This mechanism has the disadvantage that it
requires controlled variation of the transmitter and can be applied only to one
target  at  a  time.  But  it  does  show  that  no  information  need  be  lost  when 
using  the  Doppler  principle. 
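
The half-wavelength relation can be written as range = c / (2 x modulation frequency), since half an acoustic wavelength at the modulation frequency corresponds to a round trip lasting one modulation period. The sketch below only evaluates that relation; the speed of sound and the modulation frequencies are assumed values for illustration, and the function name is hypothetical.

```python
def range_from_modulation(modulation_frequency, c=343.0):
    """Range (m) equal to half the wavelength of the modulation frequency.

    c is an assumed speed of sound in m/sec.
    """
    return c / (2.0 * modulation_frequency)


# Hypothetical modulation frequencies in cps.
for f_mod in (50.0, 100.0, 500.0):
    print(f"{f_mod:.0f} cps modulation -> range {range_from_modulation(f_mod):.2f} m")
```

A nearer target therefore calls for a faster modulation, which is why the transmitter has to be adjusted for each target in turn.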

Figure 10 The Construction of Beat-Notes in a Receiver Fitted with
Vibrating Reflectors for Constant Frequency Echoes. (Axes: frequency
against time.)

The whole beat note idea is still pure hypothesis. It offers feasible
explanations of many behaviors and attendant phenomena shown by bats,
although it certainly cannot be the whole story. But it has more than
academic interest here for, to some extent at least, each of the proposed
systems can be made to work for man. At an early stage in development
of these ideas, it became clear that if the human ear were provided with
a source of bat-like sounds and a nonlinear detector, it should be able
to supply the rest of the bat’s apparatus. A simple model was built to see
if we ourselves could accomplish what we were suggesting for the bat (see
Figure 11). A generator and loudspeaker produce either constant frequency
or frequency modulated sounds; microphones and amplifiers receive the
echoes and mix them with the emission; and the detected beat notes are heard
through binaural headphones. A few minutes spent playing with this model
is more convincing than any formal demonstration. (A short motion
picture film has been made which gives some idea of its responses.)

Figure 11 A ‘Model Bat’ Which Tests the Beat-Note Hypothesis and May
Form the Basis for a Mobility Aid for the Blind

At  this  point  it  should  be  said  that  Mr.  Kay  (5,  6)  and  I  (11,  12) 
evolved  essentially  similar  ideas  about  bats,  and  we  both  built  models 
simultaneously  but  unknown  to  each  other.  The  basic  similarity  in  our 
machines  will  be  apparent  later,  but  Mr.  Kay,  in  aiming  directly  at  a  blind 
aid,  has  designed  electronic  refinements  that  the  bat  could  probably  not 
afford.  This  is  not  to  say  that  we  can  do  better  than  the  bat,  for  the  whole 
development  of  this  creature  is  organized  around  its  unusual  way  of  life 
and  sensory  capacities.  Its  brain  may  be  far  better  equipped  and  is  cer¬ 
tainly  more  specialized  to  use  sound  clues  than  ours.  It  is  much  more 
flexible  than  our  machines  can  ever  be  and  it  probably  has  other  mecha¬ 
nisms  at  its  disposal  which  further  study  may  reveal.  But  the  simple  model 
does  appear  to  improve  on  earlier  devices  by  providing  detailed  informa¬ 
tion.  It  is  yet  to  be  determined  whether  the  human  mind  is  able  to  learn  to 
form  a  spatial  impression  from  frequency  information  since  this  represents 
an  unnatural  coding  to  our  ears.  While  the  cochlea  can  effect  the  analysis, 
our  central  organization  may  not  be  flexible  enough  to  achieve  the  resyn¬ 
thesis.  Only  careful  experimentation  can  answer  this  question.  Nevertheless, 
comparison  between  the  bat  and  the  model  does  suggest  some  factors  to 
be  taken  into  consideration  when  designing  pulses  for  use  by  man. 

Our  ears  cannot  hear  the  full  range  of  beat  notes  which  may  be  pre¬ 
sented  to  the  bat.  We  may  either  restrict  the  frequency  sweep  accordingly 
(as  was  done  in  the  demonstration  film),  or  extend  the  sweep  beyond  an 
octave,  relying  on  our  ears  to  eliminate  true  echoes  and  the  longer  range, 
higher  pitched  beat  notes.  This  is  wasteful  of  energy,  but  has  the  ad¬ 
vantage  that  each  beat  note  will  last  longer,  will  be  easier  to  pitch,  and  will 
sound  more  pleasant.  But  the  user  may  not  be  prepared  to  devote  the 
whole  auditory  range  to  the  device.  The  upper  frequency  bands  can  be  freed 
progressively,  either  by  limiting  the  frequency  sweep  further  or,  preferably, 
by  filtering  off  the  higher  pitched  beat  notes.  Ranging  acuity,  beat  note 
duration,  maximum  range,  and  auditory  demand  can  each  be  adjusted  by 
design  of  the  transmitted  pulse,  but  only  at  the  expense  of  each  other  since 
all  are  interrelated. 
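The interdependence described above can be illustrated numerically. The Python sketch below assumes a linear frequency sweep and a speed of sound of 343 m/s; neither figure is given in the text, and the function and parameter names are hypothetical. Under that assumption, the beat note produced by mixing an echo with the emission has a pitch equal to the sweep rate multiplied by the echo delay, and it lasts only while echo and emission overlap.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air; not given in the text


def beat_note(range_m, sweep_hz, pulse_s, c=SPEED_OF_SOUND_M_S):
    """Return (beat frequency in Hz, beat duration in s) for one target.

    Assumes a linear sweep of `sweep_hz` over a pulse lasting `pulse_s`.
    """
    delay_s = 2.0 * range_m / c              # two-way travel time of the echo
    sweep_rate = sweep_hz / pulse_s          # Hz swept per second
    beat_hz = sweep_rate * delay_s           # emission minus echo frequency
    overlap_s = max(0.0, pulse_s - delay_s)  # time both signals are present
    return beat_hz, overlap_s


def max_range(audible_limit_hz, sweep_hz, pulse_s, c=SPEED_OF_SOUND_M_S):
    """Largest range whose beat note stays at or below `audible_limit_hz`."""
    return audible_limit_hz * pulse_s * c / (2.0 * sweep_hz)
```

On these assumptions, widening the sweep increases the pitch separation between targets at neighboring ranges but reduces the greatest range whose beat note still falls within a chosen audible band; lengthening the pulse does the reverse and also lengthens each beat note.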

It  is  doubtful  that  binaural  operation,  using  a  wide  beam  transmitter 
and  receivers,  could  be  learned  by  man.  It  may  be  too  confusing.  Also,  the 
binaural  mode  demands  two  earphones  or  inserts,  which  would  obstruct 
normal  hearing.  A  single  channel  device  can  use  a  bone  conductor  on  the
skull  to  leave  both  ears  free.  With  a  narrow  beam  single  channel  arrange¬ 
ment,  directional  information  can  be  supplied  by  proprioceptors:  in  the 
arm  for  a  hand-held  device,  or  on  the  neck  for  apparatus  worn  on  the 
head.  It  should  be  possible  to  build  the  slim  shiny  transmitting  and  re¬ 
ceiving  transducers  into  the  ‘lenses’  of  silvered  sunglasses,  with  a  bone  con¬ 
ductor  behind  the  ear  on  the  end  of  one  sidepiece.  This  would  combine  the 
frequency  modulation  of  Vespertilionids  with  the  beaming  and  proprio¬ 
ception  of  Rhinolophids.  It  does  not  seem  likely  that  Doppler  or  ‘artificial 
Doppler’  modes  will  be  profitable  when  applied  to  a  mobility  device,  but 
neither  should  be  forgotten  completely  since  they  may  be  useful  for  special 
applications. 

Here  both  biological  and  applied  speculation  get  beyond  bounds  for 
the  present.  It  is  more  worthwhile  to  examine  some  other  echolocating 
animals.  There  are,  of  course,  more  bats  than  the  two  families  already  dis¬ 
cussed,  and  many  of  them  produce  sounds  which  have  different  character¬ 
istics.  The  two  general  types  of  pulse  can  be  distinguished  throughout,  but 
some  bats  combine  the  constant  frequency  and  falling  frequency  phases  to 
a  greater  extent  than  do  Rhinolophids.  Some  show  cyclical  amplitude  modu¬ 
lation,  others  regularly  exhibit  marked  harmonic  content  (4)  which  would 
add  harmonics  to  the  beat  notes  too.  The  possible  implications  of  some 
of  these  features  have  not  been  assessed,  but  at  present  they  appear  to 
offer  no  immediate  improvements  in  blind-aid  developments. 

None  of  the  other  animals  which  use  active  acoustic  orientation  pro¬ 
duce  sounds  with  the  simple  harmonic  content  of  Microchiropteran  bats. 
One  genus  of  the  Megachiroptera,  or  Old  World  fruit  bats,  navigates  with 
great  skill  using  clicks  produced  with  the  tongue.  At  least  two  groups  of 
birds  produce  click-like  sounds  of  complex  structure  when  flying  in  dark 
caves.  These  are  the  oil  birds,  Steatornis,  of  South  America,  and  the  swiftlets 
called  Collocalia  of  Southeast  Asia.  Neither  appears  to  achieve  high  acuity 
and  vision  is  used  when  possible.  The  beat  note  hypothesis  cannot  be  applied 
to  any  of  these  animals,  and  it  is  not  clear  how  they  could  overcome  the 
problems  of  echolocation.  Indeed,  they  may  not  overcome  them  com¬ 
pletely,  for  none  of  them  uses  the  faculty  to  capture  its  food  in  flight  as 
does  the  insectivorous  bat. 

The  closest  rivals  of  the  bat  known  at  present  are  the  porpoises  and 
dolphins,  whose  full  capabilities  are  probably  still  unrealized.  Work  by 
Kellogg,  Lilly,  Norris  and  others  (7,  8,  10)  has  shown  that  they  use  echolo¬ 
cation  with  great  precision  in  a  medium  where  sound  travels  nearly  five 
times  as  fast  as  in  the  air.  The  orientation  sounds  are  said  to  be  click-like,
and  analyses  indicate  short  duration  and  a  wide  spectrum  of  frequencies. 
Despite  the  very  large  vocabulary  of  the  animals  it  is  now  improbable  that 
other  sounds  of  pure  quality  will  be  found  to  have  a  normal  navigational 
function.  Again  the  beat  note  idea  must  be  rejected,  but  two  considerations 
suggest  themselves  as  possible  partial  explanations.  First,  despite  the  factor 
of  five  in  sound  velocity,  the  porpoise  and  its  world  are  more  than  five 
times  the  size  of  the  bat  and  its  immediate  environment.  Secondly,  the 
auditory  nerve  of  the  porpoise  is  composed  of  very  large  fibers  which 
should,  by  the  laws  of  neurophysiology,  confer  the  properties  of  high  con¬ 
duction  velocity  and  discharge  rates.  The  enormous  brain  may  also  allow 
of  complexities  not  available  to  the  bat.  A  small  insectivorous  bat  may 
weigh  only  3.5  gm  when  adult,  about  eight  of  them  to  the  ounce,  all  weigh¬ 
ing  perhaps  less  than  one  auditory  nerve  trunk  of  a  porpoise. 

Deputing  responsibility  to  the  brain,  even  a  large  brain,  is  no  answer 
to  physical  problems  or  the  limitations  of  sensory  nerves,  and  speciali¬ 
zation  of  internal  anatomy  cannot  help  the  subject  of  these  papers.  It  re¬ 
mains  to  be  seen  whether  porpoises  can  suggest  a  means  of  improving  our 
own  sonic  guidance  devices. 

The  reception  and  interpretation  of  short-range  echoes  imposes  several 
problems  of  sensory  physiology,  but  a  review  of  the  phenomena  shown 
by  bats  suggests  a  mechanism  by  which  these  problems  may  be  overcome. 
The  design  of  the  probing  sounds  allows  a  transformation  of  the  echo  in¬ 
formation  into  a  form  acceptable  to  the  cochlea.  With  the  aid  of  a  simple 
apparatus,  the  human  ear  is  able  to  achieve  accurate  discrimination  in  all 
three  proposed  modes,  one  of  which  shows  promise  as  a  mobility  aid.  No 
other  echolocating  animal  offers  a  suggestion  for  further  improvement  in
this  field  at  present.

REFERENCES

1. Dijkgraaf, S., “Sinnesphysiologische Beobachtungen an Fledermäusen,” Acta
Physiol. Pharm. Neerl., Vol. 6 (1957), p. 675.

2. Griffin, D. R. Listening in the Dark. New Haven, Connecticut: Yale University
Press, 1958.

3. Griffin, D. R. Echoes of Bats and Men. New York: Doubleday, 1959.

4. Griffin, D. R., and A. Novick, “Acoustic Orientation in Neotropical Bats,” J. Exp.
Zool., Vol. 130 (1955), pp. 230-251.

5. Kay, L., “A Plausible Explanation of the Bat’s Echolocation Acuity,” Anim.
Behav., Vol. 10 (1962), pp. 34-41.

6. Kay, L., “Active Energy Radiating Systems: Ultrasonic Guidance for the
Blind,” in L. L. Clark (ed.) Proceedings of the International Congress on
Technology and Blindness, Vol. I. New York: American Foundation for the
Blind, 1963.

7. Kellogg, W. N. Porpoises and Sonar. Chicago: University of Chicago Press, 1961.

8. Lilly, J. C. Man and Dolphin. London: Gollancz, 1962.

9. Möhres, F. P., “Über die Ultraschallorientierung der Hufeisennasen
(Chiroptera-Rhinolophinae),” Z. Vergl. Physiol., Vol. 34 (1953), p. 547.

10. Norris, K. S., J. H. Prescott, P. V. Asa-Dorian, and P. Perkins, “An
Experimental Demonstration of Echo-Location Behavior in the Porpoise, Tursiops
truncatus (Montagu),” Biol. Bull., Vol. 120 (1961), pp. 163-176.

11. Pye, J. D., “A Theory of Echolocation by Bats,” J. Laryng. Otol., Vol. 74 (1960),
pp. 718-729.

12. Pye, J. D., “Echolocation by Bats,” Endeavour, Vol. 20 (1961), pp. 101-111.


ACTIVE ENERGY RADIATING SYSTEMS:

THE BAT AND ULTRASONIC PRINCIPLES II;
ACOUSTICAL CONTROL OF
AIRBORNE INTERCEPTIONS BY BATS*

FREDERIC A. WEBSTER

Research Consultant, Cambridge, Massachusetts

* This work was performed in part under a subcontract with MIT Lincoln
Laboratory, which is operated with support from the U. S. Army, Navy, and Air
Force. The observations reported represent a partial survey of investigations carried
out in conjunction with many individuals, most notably: D. A. Cahlander, D.
Dunning, N. Durlach, J. Friend, D. R. Griffin, G. Horne, K. D. Roeder, and A. E.
Treat. Others providing important assistance of many sorts include: D. W. Batteau,
A. Boass, A. Carpenter, J. Cope, H. E. Edgerton, C. Gifford, T. Gregg, A. Grinnell,
H. Hitchcock, L. Kay, A. Lagon, C. Michael, J. J. G. McCue, A. W. Mills, A. Novick,
and J. D. Pye.


Why should a paper on airborne interceptions by bats form part of a
symposium on Man-Machine Systems and Mobility Devices for blind persons?
Why bats? Why interceptions? The broad reason why so many symposia on
blind guidance include discussions of the general techniques of bats is
perhaps fairly obvious: bats are acoustically guided creatures, and very
successfully so. They must, therefore, be capable of extremely proficient
acoustical orientation. Man might learn a lot about effective acoustical
guidance by a study of their methods. But why the focus on interceptions
of airborne targets? The reason, again, is very straightforward. Airborne
interceptions constitute the focal point around which a major portion of
the bat’s techniques have developed: insectivorous bats, after all, survive
in large measure by the capture of flying insects. In seeking a truly
comprehensive grasp of the bat’s methods and the rationale behind them, the
central components in their evolution cannot be by-passed.

The structure of the present paper was thus dictated by the following
major considerations.

1) Using auditory reference alone (or virtually so), bats execute
almost preposterously quick and accurate maneuvers, often under
severe constraints imposed by complex physical surroundings. They
do this with equipment that is, perhaps, thousands of times smaller
than the corresponding equipment of the human auditory system;
vastly smaller still than any effective artificial system. Their methods
and their mechanisms must, therefore, warrant thorough study by
those seeking oriented guidance of human beings or of artificial systems
by nonvisual means.

2)  Tests  have  shown,  however,  that  direct  conversion  of  the  more 
obvious  features  of  a  bat’s  signal  system  to  the  frequency  range  of 
human  hearing  does  not  in  itself  provide  an  adequate  basis  either  for 
interpretation  by  the  human  auditory  system  or  for  evaluation  of  the 
environmental  features  essential  in  human  applications.  More  must  be 
known  about  how  the  human  auditory  system  deals  with  problems 
akin  to  those  of  the  bat;  about  how  the  auditory  system  of  the  bat 
deals  with  the  information  it  receives,  and  about  the  interrelations 
among  emitted  signal,  physical  problem,  auditory  analysis,  and  motor 
control  in  the  bat’s  situation. 

3)  There  are  many  avenues  of  approach  to  the  bat’s  system: 
histological,  neurophysiological,  ecological,  psychophysical,  and  be¬ 
havioral,  among  others.  Behavioral  analysis  of  the  bat’s  performance 
on  central  problems  of  its  survival  is  certainly  a  vital  one.  The  present 
discussion  undertakes  an  introductory  survey  of  the  bat’s  performance 
and  techniques  in  the  capture  of  its  prey.  As  is  to  be  expected,  however, 
such  analysis  leads  to  a  regress:  the  receding  succession  of  questions 
that  deal  with  how  the  bat’s  system  came  to  be  evolved  in  its  present 
form.  In  the  evaluations  of  its  quickness,  precision,  and  reliability  we 
learn  much  about  its  remarkable  performance  and  how  this  is  achieved; 
but  in  its  flaws,  its  errors,  and  its  inadequacies  we  gain  additional  in¬ 
sights  into  the  forces  that  make  it  what  it  is.  The  present  paper  attempts 
to  put  the  bat’s  methods  and  accomplishments  in  this  perspective. 


INTRODUCTION 

Insectivorous bats, equipped with a computational system little larger than
the tip of a pencil, use acoustically triggered mechanisms to guide their
interceptions of insect targets. They do so with a quickness, precision, and
infallibility that often leave our largest and fastest electronic equipment
far behind. Sometimes all the insects to be caught are edible, readily
detected, and flying slowly in an unobstructed area. Such pursuits are easy. At
other times, however, the interception situation is vastly more complex and
difficult. The insects may be as small as one-fifth milligram (e.g., gnats)
or as elusive as the deftly evading noctuid moths; some may be distasteful
or dangerous; and often the areas in which the insects must be caught are
intricately laced with foliage, or lined with destructive projections such as
thorns and twigs. How can a bat, under such precarious conditions, engage
in a thousand pursuits a night, sometimes with over 90 percent success,
and yet survive unharmed for 20 years?

History reveals a strange lack of imagination and initiative in dealing
with this intriguing problem (3, 8). The outstanding early work was done by
Lazzaro Spallanzani and several contemporaries, starting about 1793.
Spallanzani, for example, discovered that bats which had been blinded found
their way back to their roosts as well as did nonblinded bats, and that they
appeared to catch just as many insects. Moreover, Rees’s Cyclopaedia of
1819, referring to experiments done in the late 1790’s, says that according
to “Professor Jurin of Geneva . . . neither the touch, nor ear, nor smell,
nor taste is . . . sufficient to supply the want of sight; but from some
anatomical investigations of these animals, he concluded that a very large
proportion of nerves is expanded on the upper jaw, the muzzle, and the
organ of hearing; and these appeared to him, in a great degree, to account
for the extraordinary faculty . . .” (26). Before 1800, in other words, the
ears and mouth were jointly implicated in the mechanisms of orientation
and pursuit. Moreover, the relevant observations were publicly available.
Yet for almost a century and a half, in an era of great scientific curiosity and
growth, no further progress of account was made.

Long-delayed  recognition  of  the  fact  of  sensory  detection  beyond  the 
range  of  human  sensitivity,  together  with  instrumentation  for  the  relevant 
measurements,  finally  reopened  the  door  to  the  bat’s  orientation  secrets.  But 
it  took  an  imaginative  undergraduate  at  Harvard  College  to  initiate  the 
crucial  tests.  Shortly  before  the  Second  World  War,  Donald  R.  Griffin,  with 
the  collaboration  of  Robert  Galambos,  began  a  systematic  study  of  the  bat’s 
ultrasonic  signals.*  These  investigators  did  not,  however,  study  pursuit  be¬ 
havior  at  that  time.  Rather,  they  studied  another  major  area:  the  avoidance 
of  obstacles  (4,  10).  Such  behavior  was  more  easily  brought  into  the 
laboratory  and  much  more  readily  adapted  to  quantitative  evaluation.  Their 
work  amply  demonstrated  that  ultrasonic  mechanisms  alone  accounted 
adequately  for  all  the  measured  obstacle  avoidance. 


* Griffin notes, with respect, that the Dutch zoologist, Sven Dijkgraaf, determined
independently of the work being done by himself and Galambos (and with his
unaided ears alone) that the faint audible component of the bat’s orientation signals
was directly related to obstacle-avoidance. By testing the effects of ear-plugging,
mouth-blocking, and sensory denervation of the wings, Dijkgraaf established that
echolocation was the bat’s means of orientation, and did so with no significant
instrumentation beyond that which had existed for centuries (8, p. 76).

Actually,  the  avoidance  of  specific  obstacles  is  part  of  a  somewhat 
branched  continuum  of  oriented  behavior.  At  one  end  of  this  continuum 
are  the  general  surrounding  configurations  which  shape  the  bat’s  over-all 
flight  path.  Here,  for  example,  are  the  leaves  and  small  branches  of  trees 
which  constitute  specific  obstacles  by  themselves,  but  under  many  condi¬ 
tions  reach  a  density  and  extent  which  a  bat  evaluates  as  a  total  grouping. 
Here  it  is  the  total  configuration  rather  than  the  specific  obstacle  that  con¬ 
strains  the  bat’s  direction  of  flight.  At  the  other  end  of  the  continuum  are 
small,  discrete,  moving  objects — namely  flying  insects — which  are  to  be 
pursued  and  caught.  What  about  a  falling  leaf?  Such  an  object  is  well  down 
the  continuum  from  a  tree,  or  even  from  the  end  of  one  of  its  branches.  Yet 
a  leaf  is  ordinarily  not  as  small  as  an  insect,  nor  does  it  have  the  same  attri¬ 
butes  of  motion.  Is  it  to  be  avoided  or  pursued?  As  will  become  evident 
below,  categorical  answers  to  such  questions  are  not  possible,  for  the  zone 
of  transition  between  obstacles  and  food  targets  is  large,  and  many  factors 
may  influence  a  bat’s  decision. 

If  we  assume  for  the  moment  that  evolutionary  forces  have  committed 
the  bats  under  discussion  to  survival  by  acoustically  guided  captures  of 
flying  insects,  we  must  next  look  for  the  elements  that  may  have  shaped  the 
bat’s  methods  and  techniques.  In  other  words,  we  must  gain  a  frame  of 
reference  defined  by  the  basic  nature  of  the  bat  and  by  the  tasks  it  has  to 
perform.  First  of  all,  we  must  inquire  into  the  nature  of  the  bat  itself:  its 
physical  structure,  its  operating  limits  in  terms  of  flightspeed,  reaction 
time,  maneuverability,  and  so  on.  We  must  also  explore  the  nature  of  its 
signals,  echo-reception  devices,  and  processing  mechanisms.  Second,  we 
must  discover  how  it  actually  goes  about  the  maneuvers  by  which  it  sur¬ 
vives,  how  it  decides  what  to  catch  and  what  to  avoid,  and  how  it  governs 
the  interception  procedures  it  uses.  Third,  we  must  learn  something  about 
the  targets  themselves:  their  size,  their  reflective  changes  as  a  result  of 
wing  action  and  relative  orientation,  their  flight  velocities  and  flight  pat¬ 
terns;  also,  special  attributes  that  may  modify  the  bat’s  problems  of  identi¬ 
fication,  selection,  and  pursuit.  Finally,  we  must  note  effects  of  the  sur¬ 
rounding  situation:  how,  for  example,  the  bat  deals  with  the  intricate 
spatial  relations  that  often  constrain  the  course  of  interception.  All  such 
questions  obviously  break  down  into  innumerable  subquestions.  Although 
answers  to  some  of  these  subquestions  are  pretty  well  known  and  will  be 
illustrated  below,  many  are  virtually  unexplored;  and  certainly  all  must  be 
answered  within  the  broad  frame  of  reference  given  by  the  bat’s  basic  nature 
and  by  the  total  nature  of  its  task. 




GENERAL NATURE OF BATS, THEIR SIGNALS,
AND THEIR INTERCEPTION PROBLEMS

Dominant Characteristics of Bats

Physical  Characteristics  and  Their  Implications.  Physically,  the  insectivorous 
bat  of  the  temperate  regions  is  small  (body  length  roughly  5  to  8  centi¬ 
meters),  light  (roughly  5  to  15  grams),  and  flexible.  Such  properties  ob¬ 
viously  adapt  it  well  to  its  particular  mode  of  life.  They  permit  it  to  maneu¬ 
ver  fast  and  respond  quickly.  The  low  inertia  of  the  bat’s  physical  com¬ 
ponents  permits  high  acceleration  of  structures  such  as  the  wings;  and  its 
lightness  is  such  that  the  air  readily  provides  the  resistance  required  for  rapid 
aerial  maneuvers.  Moreover,  neural  conduction  takes  less  time  over  short 
pathways.  Such  properties,  however,  also  dictate  in  large  measure  the 
maximum  admissible  size  of  the  analytical,  evaluating,  and  integrating 
systems.  Were  a  bat’s  brain  size  to  increase  greatly,  its  total  weight  would 
tend  to  increase  also.  Without  radical  new  adaptations  the  bat’s  maneuver¬ 
ability  and  quickness  might  thus  fall  off  and  the  bat  would  be  forced  to  rely 
on  interceptions  predicted  further  in  advance.  But  the  complexities  of  long 
range  prediction  tend  to  multiply  enormously.  In  the  bat’s  situation,  they 
would  call  for  disproportionate  increases  in  the  powers  of  analysis  and  eval¬ 
uation.  Varied,  but  simple,  tactics  of  evasion  and  deceit  by  the  various  mem¬ 
bers  of  the  insect  world  might  produce  a  virtually  insurmountable  array  of 
problems  in  selection  and  prediction.  To  make  matters  worse,  the  increasing 
size  of  the  bat  would  enhance  its  nutritional  needs  and  once  more  aggravate 
the  problem.  Some  bats  appear  to  have  found  a  way  out  of  this  dilemma 
by  adapting  to  new  foods:  fruit,  nectar,  and  fish,  for  example.  In  the 
present  discussion,  however,  we  are  concerned  with  bats  that  have  re¬ 
mained  insect  eaters.  It  is  clear  that  such  bats  use  their  natural  attributes  to 
excellent  advantage;  they  have  not  only  survived  well  in  the  course  of 
evolution,*  but  they  also  live  long  as  individuals. 

* Bats appear to have remained basically unchanged for about 50 million years
(see, for example, 8, p. 6).

Some Other Attributes of Bats. Questions about speed, maneuverability,
and reaction time will be partially answered in the illustrations presented
later. By way of preliminary comment, it can be said that flight speeds vary
from 0 ft/sec for hovering bats like Plecotus (as, for example, during the
exploration of some detail of its surroundings; see Figure 1), to perhaps
35 or more ft/sec for the red bat, Lasiurus borealis, during pursuit. In the
laboratory, flight speeds normally range from about 8 to 20 ft/sec (roughly
2 to 6 meters/sec). Maneuverability is astonishing. Figure 2, for example,
shows a wild red bat which in roughly one-third of a second of pursuit
rolled onto its back, dove, rolled upright, and captured a moth that was
spiraling toward the ground. Response times have not been systematically
measured, but high speed films suggest that observable reactions to
unexpected events may sometimes take place in as little as one-thirtieth of a
second. The remarkable and violent maneuvering of many insects demands
quick appreciation of trajectory and a rapid response to sudden shifts of
the target’s path.

Figure 1 Plecotus townsendii Hovering. Plecotus bats are noted chiefly for their
very large ears and associated low-intensity signals. Another striking feature is their
capacity to hover, somewhat after the fashion of the hummingbird. While hovering
a half-dozen inches from a surface, these bats can apparently make certain detailed
evaluations of configurations on the surface. They detect stationary insects and
commonly are able to make precise localization of mealworms held in the fingers. They
have also been observed to initiate pursuit of insects flying very close to a surface.
Pulse durations while hovering are of the order of one millisecond.
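The figures quoted above can be put on a common footing with a little arithmetic. The Python sketch below converts the stated flight speeds to metric units and estimates how far a bat travels during the one-thirtieth of a second response time suggested by the high speed films. The function names are illustrative only and do not come from the paper.

```python
FT_TO_M = 0.3048  # exact definition of the international foot


def ft_per_sec_to_m_per_sec(speed_ft_s):
    """Convert a speed from feet per second to meters per second."""
    return speed_ft_s * FT_TO_M


def distance_during_reaction(speed_ft_s, reaction_s=1.0 / 30.0):
    """Distance in feet covered before an observable reaction begins."""
    return speed_ft_s * reaction_s


# Speeds quoted in the text: laboratory range (8 to 20) and red bat pursuit (35).
for speed in (8.0, 20.0, 35.0):
    print(f"{speed:4.0f} ft/sec = {ft_per_sec_to_m_per_sec(speed):4.1f} m/sec, "
          f"{distance_during_reaction(speed):.2f} ft travelled in 1/30 sec")
```

At 20 ft/sec, for example, the bat covers about two-thirds of a foot before its reaction can be observed.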

General Nature of Bats’ Signals. From what has already been said, it is
clear that the echolocation system of bats has extremely severe demands
upon its capacities. For example, it must allow a very rapid rate of data
input; it must permit excellent resolution of detail, and it must facilitate
extraordinarily quick handling of rapidly shifting spatial relations. For a
bat’s interceptions to be successful, in other words, the evaluations made
by the bat must not only be very quick but they must often be extremely
precise. Clearly the bat’s signals must provide for efficient sorting of input
data at high data rates, excellent resolution of detail, and remarkable
resistance to confusion in complex situations.

Figure 2 Capture of Downward Spiraling Moth by the Red Bat Lasiurus borealis.
Approaching at a speed of about 20 feet (6 meters) per second, the bat rolled onto
its back, dove, righted itself, and captured the moth, all in about a third of a second.
Significant also is the fact that this interception was so close to the ground (about
three feet) that the form of the interception had to be shaped to the configuration
of surface obstacles. (This figure is traced from a multiflash sequence with a flash
rate of about 10/second. Except where noted, subsequent multiflash pictures were
made at this rate, plus or minus about 1/2. Corresponding pictures of bat and target
are normally numbered, with the larger number representing the bat, and the smaller
number the target.) This picture was made in collaboration with A. E. Treat.
[Scale bar in figure: 0, 10, 50, 100, 150 cm.]

Interestingly  enough,  different  bats  have  evolved  rather  different  signal 
structures.  Indeed  the  only  features  common  to  all  are:  (1)  pulsed  form, 
(2)  high  average  frequency  (mostly  20  to  120  kc/sec),  (3)  variable  pulse 
duration,  (4)  variable  repetition  rate  (duty  cycle  varies  from  roughly  2  per¬ 
cent  to  90  percent  in  different  bats  and  different  situations),  and  (5) 
sinusoidal  or  harmonic  structure  of  the  carrier  (as  opposed  to  a  click  or 
noise  structure).  Most  outstanding  among  these  features,  perhaps,  are:  (1) 
the  high  frequencies  of  the  carrier  and  (2)  the  variability  of  the  different 
functional  units  that  make  up  the  signal. 
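Two of the quantities listed above lend themselves to a short calculation. The Python sketch below assumes a speed of sound of about 343 m/s in air, a value not given in the text, and computes the wavelength for the stated carrier frequencies together with the duty cycle that follows from a pulse duration and a repetition rate. The example pulse duration and repetition rate are hypothetical, chosen only to show the calculation, and the function names are illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed; varies with temperature


def wavelength_mm(frequency_khz, c=SPEED_OF_SOUND_M_S):
    """Wavelength in millimeters of a carrier at `frequency_khz`."""
    return c / (frequency_khz * 1000.0) * 1000.0


def duty_cycle_percent(pulse_ms, pulses_per_sec):
    """Percentage of the time during which a pulse is being emitted."""
    return pulse_ms / 1000.0 * pulses_per_sec * 100.0


# Limits of the carrier range quoted in the text (20 to 120 kc/sec).
for f_khz in (20.0, 120.0):
    print(f"{f_khz:5.0f} kc/sec -> wavelength {wavelength_mm(f_khz):.2f} mm")

# Hypothetical example: 2-millisecond pulses repeated 10 times per second.
print(f"duty cycle: {duty_cycle_percent(2.0, 10.0):.0f} percent")
```

On this assumption the quoted carrier frequencies correspond to wavelengths of roughly 3 to 17 millimeters.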

Among  the  more  important  differences  the  following  are  conspicuous. 
Certain  bats,  notably  the  Old  World  horseshoe  bats,  emit  very  long  pulses 
(up  to  50  milliseconds  or  more)  which  are  extremely  constant  in  frequency 
over  most  of  their  duration.  Other  bats,  by  contrast,  emit  pulses  which 
seldom  exceed  3  milliseconds  in  duration.  From  these  figures  it  is  im¬ 
mediately  apparent  that  certain  bats  must  be  guided  chiefly  by  echoes 
which  are  received  while  the  pulses  are  being  emitted,  while  others  must 
be  guided  primarily  by  echoes  received  during  the  quiet  intervals  between 
pulses.  Moreover,  since  the  long  pulses  often  have  a  very  gradual  beginning 
and  ending,  as  well  as  almost  no  modulation,  they  contain  virtually  no 
intrinsic  time  reference,  whereas  the  short  pulses  normally  incorporate 
either  a  very  rapid  frequency  sweep,  from  high  to  low,  or  they  include  a 
variable  harmonic  structure.  Either  of  these  can  provide  an  effective  time 
reference.  Enormous  variations  in  signal  intensity  are  also  seen  among 
the  different  bats  (8,  13,  20).  A  microphone  placed  1  foot  in  front  of  the 
point  of  signal  emission  may  show  intensities  as  high  as  105  db  (re  0.0002 
dynes/cm2)  for  the  greater  horseshoe  bat  and  as  low  as  60  db  for  Plecotus, 
or  less  than  55  db  for  the  whispering  bat  Carollia.  Frequency  range,  as 
already  suggested,  also  differs  greatly  from  bat  to  bat;  moreover,  it  may 
shift  markedly  during  the  course  of  maneuvers  by  a  given  bat,  though 
such  shifts  also  vary  from  bat  to  bat.  Despite  these  and  other  variations, 
all  insectivorous  bats  seem  capable  of  remarkable  speed  and  precision 
in  the  use  of  their  signal  indications. 
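
Because the intensity figures quoted above are on a logarithmic scale, the spread between species is larger than the decibel numbers may suggest. The following Python sketch converts the quoted levels into pressure and intensity ratios, using the 0.0002 dynes/cm² reference stated in the text. The function names are illustrative assumptions and do not come from the source.

```python
# Reference pressure quoted in the text (0.0002 dynes/cm^2).
REFERENCE_PRESSURE = 0.0002


def pressure_from_db(level_db, reference=REFERENCE_PRESSURE):
    """Sound pressure (dynes/cm^2) for a sound pressure level in decibels."""
    return reference * 10 ** (level_db / 20.0)


def pressure_ratio(level_a_db, level_b_db):
    """How many times greater the pressure is at level A than at level B."""
    return 10 ** ((level_a_db - level_b_db) / 20.0)


def intensity_ratio(level_a_db, level_b_db):
    """How many times greater the intensity (power per area) is at A than at B."""
    return 10 ** ((level_a_db - level_b_db) / 10.0)


if __name__ == "__main__":
    horseshoe_db = 105.0  # greater horseshoe bat, level quoted in the text
    plecotus_db = 60.0    # Plecotus, level quoted in the text
    print("pressure ratio :", pressure_ratio(horseshoe_db, plecotus_db))
    print("intensity ratio:", intensity_ratio(horseshoe_db, plecotus_db))
```

On these figures the 45-db gap between the greater horseshoe bat and Plecotus corresponds to a pressure ratio of roughly 178 and an intensity ratio of roughly 32,000.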

To  understand  how  a  bat  uses  its  signals  to  guide  its  interceptions  of 
insects,  we  must  first  know  how  a  bat  goes  about  intercepting  and  capturing 
its  targets.  But  we  still  have  to  go  back  another  step.  The  reason  why  a 
bat  catches  as  it  does  derives  from  the  nature  of  the  targets  it  has  to  catch 
and from the characteristics of their flight paths. Before investigating the
details  of  the  bat’s  techniques,  therefore,  something  should  be  known  about 
the  nature  of  the  targets  and  their  actions. 

Some Properties of the Bat’s Targets

Target  Size.  While  the  varying  signals  of  the  different  bats  might  logically 
be  described  in  terms  of  requirements  for  the  pursuit  of  different  kinds  of 
insects,  no  such  clear-cut  categories  have  been  established.  At  the  same 
time,  observational  evidence  suggests  that  many  slower  flying  bats  may  select 
insect  targets  of  a  large  variety  of  sizes,  from  2-millimeter  gnats  (of  roughly 
1/5-milligram  weight)  to  moths  of  about  50-millimeter  wing  span  (roughly 
200  milligrams  in  weight)  or  more.  Faster  flying  bats,  on  the  other  hand, 
appear  to  select  mostly  larger  insects.  This  latter  is  certainly  to  be  ex¬ 
pected.  A  very  small  insect  would  not  produce  an  echo  that  could  be 
detected  at  any  great  distance.  Thus,  by  the  time  a  fast  flying  bat  could  detect 
the  insect,  there  would  be  no  chance  for  a  maneuver  quick  enough  to 
permit  successful  interception.  The  long  pulse  length  of  some  of  the  faster 
bats  would  also  tend  to  preclude  close-range  detection.  An  echo  returning 
from  a  10-millisecond  pulse,  for  example,  begins  to  overlap  the  outgoing 
pulse  when  the  object  distance  decreases  to  about  5  ft.  It  seems  likely  that 
an  insect  too  small  to  give  usable  echoes  at  6  or  7  ft  (about  2  meters) 
will  go  undetected,  or  certainly  unevaluated.  On  the  other  hand,  with  a 
search  pulse  of  only  2  milliseconds  duration,  the  situation  is  quite  different. 
Here  an  unimpeded  echo  comes  back  from  an  object  as  close  as  1  ft  away 
(one-third  of  a  meter).  Moreover,  an  enormous  difference  in  echo  strength 
exists.  As  compared  with  a  target  at  6  ft,  a  like  target  at  1  ft  (for  equivalent 
output  intensity  and  frequency)  produces  an  acoustical  energy  in  the  echo 
roughly 6⁴ (about 1300) times greater per millisecond, and about 250 times greater
in  the  total  2-millisecond  pulse.  As  indicated  above,  the  slower  flying  bat 
could  maneuver  much  more  quickly  to  a  target  detected  at  such  close  range. 
Even  for  those  bats  which  use  pulse  overlap,  small  targets  can  presumably 
be  detected  only  close  at  hand.  Thus,  target  size,  range  of  detection,  and 
flight  speed  must  be  closely  related;  and,  for  most  bats,  pulse  duration  may 
also  influence  the  size  of  targets  selected. 
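
The distances and ratios in the paragraph above can be checked with a short calculation. The Python sketch below rests on three assumptions that the source does not state: a speed of sound of about 1125 ft/s, an echo power that falls off as the inverse fourth power of range (the usual small-target approximation), and a reading of the "about 250 times" figure as the 1-ft echo of a 2-millisecond pulse set against the 6-ft echo of a 10-millisecond pulse. All names are illustrative.

```python
# Assumed speed of sound in air at room temperature (not given in the text).
SPEED_OF_SOUND_FT_PER_S = 1125.0


def overlap_distance_ft(pulse_ms, c=SPEED_OF_SOUND_FT_PER_S):
    """Range below which the returning echo begins to overlap the outgoing pulse.

    The echo arrives after a round trip of 2*d/c seconds, so overlap starts
    when that delay is shorter than the pulse duration: d = c * T / 2.
    """
    return c * (pulse_ms / 1000.0) / 2.0


def echo_power_ratio(far_ft, near_ft):
    """Echo power at the near range relative to the far range (1/r^4 law)."""
    return (far_ft / near_ft) ** 4


def echo_energy_ratio(far_ft, near_ft, far_pulse_ms, near_pulse_ms):
    """Total echo energy per pulse, near versus far, allowing for pulse length."""
    return echo_power_ratio(far_ft, near_ft) * near_pulse_ms / far_pulse_ms


if __name__ == "__main__":
    print("overlap range, 10 ms pulse (ft):", overlap_distance_ft(10.0))
    print("overlap range,  2 ms pulse (ft):", overlap_distance_ft(2.0))
    print("echo power ratio, 1 ft vs 6 ft :", echo_power_ratio(6.0, 1.0))
    print("echo energy ratio per pulse    :", echo_energy_ratio(6.0, 1.0, 10.0, 2.0))
```

Under these assumptions the overlap ranges come out a little over 5 ft and 1 ft, the power ratio is 6⁴ = 1296, and the per-pulse energy ratio is about 260, in line with the rounded figures in the text.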

Target  Trajectories  and  Other  Factors.  Size,  however,  is  not  the  only 
important  attribute  of  a  bat’s  targets.  Flight  speed,  acceleration,  and  pat¬ 
tern  of  maneuver  are  equally  relevant.  Though  the  flight  speeds  of  night- 
flying  insects  tend  to  be  considerably  less  than  the  flight  speeds  of  bats, 
many  insects  are  capable  of  rapidly  accelerated  dartings  and  dives;  some 
execute  abrupt  and  unpredictable  changes  in  flight  direction,  while  others 
carry  out  complex  and  varied  evasive  tactics. 


Presumably  a  significant  relation  exists  between  insect  size  and  vio¬ 
lence  of  maneuver.  Fruit  flies,  for  example,  would  not  be  expected  to 
reach  the  speeds,  nor  to  exhibit  the  complexities  of  maneuver,  noted  in 
many  of  the  larger  insects  such  as  moths.  This  situation  may  well  provide 
a  convenient  exchange  relation.  The  bat,  in  other  words,  may  be  free  to 
choose  between  simple  catches  frequently  repeated  or  difficult  catches 
achieved  at  greater  intervals  and  with  greater  effort.  With  small  targets, 
a  bat  can  normally  expect  lower  velocities  and  a  smaller  chance  of  rapid 
maneuver.  This  presumably  simplifies  the  interception  procedure  and  the 
technique  of  catch.  To  avoid  unnecessary  effort  on  such  small  insects,  a 
bat  might  use  techniques  producing  rapid  captures  but  not  certainty  of 
catch.  The  bat’s  objective  here  would  be  to  achieve  the  most  rapid  possible 
succession  of  likely  catches  without  wasting  excessive  time  or  effort  on 
any  one.  With  larger  targets,  the  situation  is  different.  In  terms  of  fuel  and 
other  essentials  gained,  the  value  of  such  a  catch  may  be  100  or  more 
times  greater.  Much  more  effort  and  time  per  catch  are  thus  warranted. 
Since  larger  insects  also  may  execute  more  baffling  maneuvers,  much 
greater  effort  may  be  needed  to  achieve  reasonable  probability  of  catch. 
It  is  also  likely  that  some  insects  are  harmful  or  obnoxious.  Since  this  is 
much  more  likely  to  hold  for  the  large  insects,  careful  selection  and  evalu¬ 
ation  of  individual  larger  targets  may  become  essential.  Just  how  a  bat 
plays  its  hand  so  as  to  achieve  the  highest  probability  of  gaining  essential 
nutrition  with  the  least  effort  and  minimum  risk  is  a  problem  that  may 
turn  out  to  be  very  complex.  Certain  elements  in  the  picture  will  become 
evident  in  the  illustrations  below,  but  many  facets  of  this  intriguing  problem 
remain  to  be  discovered. 

The  Bat-Moth  Battle 

Evolutionary  Nature  of  the  Bat-Moth  Battle.  Possibly  the  most  significant 
and  arresting  of  all  the  bat’s  interception  problems  are  those  which  relate 
to  the  so-called  “bat-moth  battle.”  Their  significance  derives  in  considerable 
measure  from  the  long  evolutionary  history  which  must  have  produced  the 
intricate  tactics  and  countertactics  now  in  evidence.  That  certain  moths, 
notably  the  noctuids  and  geometrids,  respond  to  the  sounds  of  approach¬ 
ing  bats  was  suspected  long  before  actual  measurements  of  the  moths’ 
tympanic  responses  were  made  (29,  34,  37).  The  extent  to  which  the 
techniques  and  tactics  on  the  two  sides  interact,  however,  has  only  recently 
come  to  light.  Certain  general  features  of  the  situation  are  worth  noting. 
The insects have many more individuals and these individuals are produced
at much more rapid rates. Moreover, there are many more kinds
of  insects  than  bats  and  each  kind  of  insect  may  have  developed  different 
properties  which  serve  to  complicate  and  confuse  the  bat’s  evaluations. 
Being  smaller  and  lighter  than  bats,  yet  large  enough  to  attain  significant 
speed,  many  of  them  can  maneuver  with  remarkable  quickness  and  ac¬ 
celeration.  As  individuals,  however,  the  insects  possess  little  adaptive 
capacity.  Whereas  an  individual  bat  may  learn  to  outwit  a  particular  insect, 
the  individual  insect  can  make  little  progress  in  adapting  to  the  pursuit 
tactics  of  the  bat.  The  “adaptive  loops”  of  insect  species  reside  outside 
the  individuals. 

But  the  bats  have  their  problems  too.  For  example,  if  the  bat  is  to 
survive  for  its  normal  life  expectancy,  which  is  often  10  to  15  years  or 
more,  it  cannot  take  serious  risks.  For  instance,  were  it  to  pursue  its  target 
into  the  twigs  of  a  tree  it  would  be  very  likely  to  tear  its  wings  and  not 
survive for long. Bats, moreover, must contend with a great deal of variety
among  the  different  members  of  the  insect  world  and  with  corresponding 
diversities  in  their  behavior.  The  bat  must,  therefore,  gain  its  advantage 
through  individual  skill  and  through  precise  and  rapid  evaluation  of  specific 
situations.  At  the  same  time,  the  very  small  size  of  the  bat’s  brain  must 
severely limit the extent and complexity of its learning. Learning and adaptation
in individual bats are probably well tailored to the more critical aspects
of the echolocation problems they face. It is thus not surprising that,
in  the  face  of  relatively  simple-seeming  artificial  problems,  bats  often  ap¬ 
pear  astonishingly  stupid;  yet,  in  the  solution  of  complex  natural  problems, 
they  commonly  seem  incredibly  capable  and  adept. 

The  Moth’s  Detection  of  Sounds  from  Bats.  In  recent  years  much  has 
been  discovered  about  the  moth’s  mechanisms  of  hearing  (27  to  32,  34  to  37). 
Outstanding  features  of  the  moth’s  acoustic  system  are:  (1)  its  great  sensi¬ 
tivity,  (2)  its  simplicity,  and  (3)  its  rough  matching  to  the  wide  band  of 
frequencies  emitted  by  bats.  The  moth’s  sensitivity  to  bat  sounds  is  such 
that  nerve  activity  generated  by  signals  from  a  bat  100  or  more  feet  away  can 
readily  be  picked  up  from  the  moth’s  tympanic  nerve  and  made  audible  to 
human  listeners  (31).  But  the  tympanic  nerve  on  each  side  is  remarkably 
simple:  it  normally  contains  only  two  acoustically  sensitive  fibers  (designated 
A  fibers) — one  that  is  highly  sensitive  and  one  that  is  about  20  db  less  so. 
The  frequency  range  of  a  moth’s  hearing  is  also  remarkable.  In  some  moths 
nerve  responses  are  obtainable  from  sounds  ranging  from  3000  to  over 
150,000  cycles/second.  Other  important  features  have  also  been  noted. 

But  how  does  a  moth  use  its  bat-detection  system?  A  few  specific 
observations have been made by Roeder (29) and others, but, for the
most part, existing evaluations are still largely guesswork. Some of the
more obvious things a moth might determine are: distance of the bat
(but probably only the nearest bat if several were present), direction
(again probably only of a single bat), phase of interception (perhaps by
judgment of intensity or pulse repetition rate), and conceivably proximity
to protective objects (as, for example, grass or leaves where the moth can
hide). That a moth can often judge the direction of an approaching bat
has been suggested experimentally by Roeder (27). Indications are
that a moth tends to turn so that its axis is parallel to the bat’s and thus
produces minimum echoes from the bat’s signals. This action occurs, however,
chiefly when the bat is at some distance. At close range, the moth
seems capable of little directional evaluation. Whether or not a moth can
judge the bat’s phase of interception is not yet established; but moths
often make sudden changes in speed and direction roughly as the bat begins
its final phase of attack (see Figure 3). Likewise, whether or not a moth
can judge its distance to the ground or to near-by shrubbery is at present
unknown. One might be tempted to guess that by use of the time intervals
between a bat’s direct signal and the corresponding echoes from the ground
or other objects, a moth might make some estimate of the location of such
objects. Evaluation of this possibility, however, awaits test.


Figure 3 Streak Pictures of Bat-Moth Encounters. For these pictures, free-flying
moths were attracted to the area with the use of ultraviolet lights. When approaching
bats were detected, the lens of the camera was opened and a Sylvania Sun-Gun was
activated. The pictures were made with the collaboration of A. E. Treat, using the
technique suggested by K. D. Roeder. The middle and lower portions of the Figure
appear in a publication by Roeder (28) and are reproduced here by permission. Due
to the absence of a time marker, the time relations cannot be specified accurately.
The top third of the Figure illustrates the capture of a moth immediately after the
moth’s initiation of a dive or diving spiral. The middle third shows a remarkably
precise predictive evaluation of the moth’s horizontal spiral. The bottom third of the
Figure demonstrates successful evasion by the moth with the use of a sudden secondary
loop-back. Apparently expecting the downward dive to be sustained, the bat
continued its high speed course without deflection when the loop-back occurred.


Figure 4 Continuing Evasive Tactics by Moths. The tracks of two moths are shown
in this streak picture (made in conjunction with A. E. Treat). Arrow points to the
sudden initiation of looping evasive tactics by one of the moths. Note the apparent
randomness of the direction but not of radius. Other pictures have shown sudden
variations in turn radius and in other attributes of the evasive pattern.


What does a moth do upon the detection of a bat? What a moth does,
upon detecting a bat, is not easy to state concisely. The maneuvers executed
by moths seem both diverse in their nature and erratic in their timing.
Roeder states, “The difficulty experienced in classifying these non-directional
evasive movements is perfectly significant in the biological situation,
since it may be expected to tax the prediction powers of the predatory bats
as well as those of the experimenter” (27, p. 57). Some of the more typical
responses of evading moths are shown in Figures 3 and 4. Others are
presented by Roeder and Treat (30, p. 145). Sudden dives, often with
sharp  loops  back,  occur  frequently  with  a  bat’s  close  pursuit.  Variously 
oriented  spirals  of  constant  or  varying  radius  are  also  seen  in  the  presence 
of  bats.  But  loops,  sudden  shifts  of  direction,  spurts  of  speed,  and  abrupt 
cessations  of  flight — often  with  unpredictable-seeming  time  relations — 
occur  in  so  many  combinations  that  the  bat’s  prediction  problem,  certainly 
as  we  see  it,  must  often  be  extremely  difficult. 

Yet if long evolutionary development has shaped the evaluating mechanisms
of the bat, the bat’s analytical system may incorporate some
very  effective  probability  indicators  that  permit  it  to  judge,  better  than  we 
as  outside  observers,  the  zones  where  an  evading  moth  is  likely  to  go. 
Certain  observations  by  Roeder  and  by  Treat  are  suggestive;  namely,  that 
the  reaction  times  of  moths  (particularly  with  reference  to  these  so-called 
“non-directional”  maneuvers)  ranged  from  one-fifth  of  a  second  to  one 
second.  In  this  amount  of  time,  the  bat  accomplishes  a  large  proportion  of 
its  interception.  If  the  moth’s  maneuvers  are  triggered  by  particular  changes 
in  the  bat’s  signals,  then  the  bat  has  a  significant  interval  during  which  the 
moth  is  likely  to  continue  the  evasive  tactics  already  initiated. 

That  evasion  works,  however,  has  been  quantitatively  demonstrated 
by  the  compilations  of  Treat.  Using  Treat’s  figures,  Roeder  and  Treat  (32) 
demonstrated  an  enormous  advantage  in  terms  of  likelihood  of  escape  for 
moths  which  evaded,  as  against  those  which  did  not.  Approximately  half 
of  the  observed  attempts  on  nonreacting  moths  resulted  in  catches,  whereas 
only  about  1/14  or  7  percent  of  the  observed  attempts  on  reacting  moths 
were  successful.  Little  systematic  observation  of  evading  moths  has  been 
made  with  bats  other  than  Lasiurus  and  Myotis;  moreover  the  observed 
results — as  clearly  pointed  out  by  the  authors — cannot  be  taken  as  the 
total  picture  of  evasive  advantage.  In  the  laboratory,  a  few  Galleria  moths 
were  flown  in  the  presence  of  Myotis  lucifugus  and  of  Plecotus  townsendii. 
The  moths  would  quickly  land,  or  not  fly  at  all,  when  Myotis  were  active. 
They  were  commonly  quite  willing  to  fly,  however,  when  Plecotus  were 
flying.  Indeed,  Plecotus  bats  sometimes  seemed  able  to  come  up  behind 
the  moths  without  producing  evasive  tactics,  or  producing  them  only  at  the 
very  last  instant.  Possibly  the  very  low  signal  level  of  Plecotus,  or  the  low 
level  combined  with  a  lower  frequency  range  (12),  causes  the  signals  to 
fall  below  threshold  for  the  excitation  of  evasive  tactics.*  At  the  other 
extreme, the high flight speed of the red bat may be partly designed to get
it to the intercept point before the moth has a chance to change its tactics
significantly. Here the predictive powers of the bat, and its techniques of
capture, tend to outwit the evasive skills of the moths. At the present time,
observations are few and speculations many. For the most part, the true
story remains to be discovered.

* Or it could be that the moth interprets the low signal level as if the bat were
at a greater distance; and consequently takes directional action (e.g., turning away
from the bat), but does not initiate violent evasive maneuvers.
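
The field figures quoted above for reacting and nonreacting moths (captures in about one half of attempts as against about 1/14) can be restated as a relative risk and an odds ratio. The Python sketch below treats the two approximate fractions as exact, which is an assumption made only for illustration. The names are hypothetical and the calculation is not part of the Roeder and Treat analysis.

```python
from fractions import Fraction

# Approximate capture rates quoted in the text, treated as exact for illustration.
CAPTURE_NONREACTING = Fraction(1, 2)
CAPTURE_REACTING = Fraction(1, 14)


def escape_probability(capture_probability):
    """Chance that an attempted interception fails."""
    return 1 - capture_probability


def odds(probability):
    """Odds corresponding to a probability, p / (1 - p)."""
    return probability / (1 - probability)


def relative_capture_risk(p_a, p_b):
    """How many times more likely capture is under condition A than under B."""
    return p_a / p_b


if __name__ == "__main__":
    print("escape, nonreacting :", escape_probability(CAPTURE_NONREACTING))
    print("escape, reacting    :", escape_probability(CAPTURE_REACTING))
    print("relative risk       :", relative_capture_risk(CAPTURE_NONREACTING, CAPTURE_REACTING))
    print("capture odds ratio  :", odds(CAPTURE_NONREACTING) / odds(CAPTURE_REACTING))
```

On these rounded figures a nonreacting moth is about seven times as likely to be caught in a given attempt as a reacting one. As the text notes, the observations cannot be taken as the total picture of evasive advantage.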

The  Ultrasonic  Signals  of  Moths.  The  escape  techniques  of  a  few  moths 
may not be limited to evasive maneuvers, for among the countertactics that
certain  limited  evidence  seems  to  implicate  is  the  production  of  ultrasonic 
pulses,  perhaps  in  response  to  the  orientation  sounds  of  bats.  The  sound 
making mechanisms of two arctiid moths have recently been described by
Blest,  Collett,  and  Pye  (2).  Moreover,  Roeder  (personal  communication) 
found  that  the  production  of  pulses  by  Halysidota  moths  could  sometimes 
be  set  off  or  terminated  by  bat-type  pulses,  the  effects  often  being  related  to 
pulse  intensities.  Though  present  evidence  is  extremely  limited,  it  suggests 
that  definite  relations  may  exist  between  the  sets  of  signals  on  the  two  sides. 

What effects do the moth’s pulses have upon the pursuit actions of bats?
One  set  of  tests,  carried  out  by  A.  E.  Treat  with  red  bats  and  Halysidota 
moths  at  Tyringham,  Massachusetts,  produced  some  striking,  if  limited, 
evidence.  Various  kinds  of  moths  were  being  tossed  by  Treat  into  the  ap¬ 
proach  paths  of  red  bats,  Lasiurus  borealis.  In  almost  all  cases,  the  bats 
either  captured  the  moths  or  made  serious  attempts  at  pursuit.  However, 
with  a  particular  kind  of  moth,  Halysidota,  one  set  of  results  was  strikingly 
different.  Out  of  about  a  dozen  tosses  of  Halysidota  the  bats  appeared  to 
veer  away  in  all  but  one  instance.  In  subsequent  tests,  Treat  found  the  re¬ 
sults  to  be  less  striking  (personal  communication);  but  the  observed  tests 
left  no  reasonable  doubt  that  under  certain  conditions  Halysidota  moths 
were  selectively  avoided  by  the  red  bat.  Whether  the  moths  were  actually 
emitting  sounds  at  the  time  of  the  test  was  not  established;  hence,  it  is 
possible  to  reason  that  features  other  than  sound  emission  may  have  caused 
evasion  by  the  bats. 

Unfortunately,  at  the  time  Halysidota  moths  were  available  for  tests 
in  the  laboratory,  red  bats  were  not,  and  vice  versa.  Tests  were  therefore 
carried  out  with  the  little  brown  bat,  Myotis  lucifugus.  Here  the  results  were 
rather  variable,  Halysidota  moths  sometimes  being  caught  with  probabilities 
similar  to  those  of  the  other  moths  tested,  at  other  times  being  largely 
avoided.  However,  the  tiger  moth,  Apantesis  virgo  (and  perhaps  one  or 
two  other  Apantesis  moths),  appeared  to  produce  invariable  avoidance 
during a very limited set of tests. In the course of a series of tests covering
a  total  of  61  tosses  of  various  targets,  including  14  of  active  tiger  moths, 
only  the  tiger  moth  was  always  avoided.  Figure  5  illustrates  the  results  of 
all  tests  which  included  tosses  of  tiger  moths.  As  shown  in  this  figure, 
when moths other than Halysidota or tiger moths were tossed, 16 out of
20 were caught or hit (and all but one apparently attempted), yet the 14
tosses of active tiger moths gave rise to no attempts. Results with Halysidota
moths were in between. Lack of suitable instrumentation prevented discovery
at this initial stage of whether the moths under test were actually emitting
clicks or pulses. A very limited supply of moths also prevented any really
adequate collection of data.


[Figure 5 chart: Results for Silent and Sound-Emitting Targets. Legend: Contact (catch, catch and drop, hit); Avoid. Groups: Silent (mealworms, moths); Sound-Emitting.]

Figure 5 Interception Results (with Myotis lucifugus) for Silent Versus Sound
Emitting Targets. The two silent target groups consisted of a) mealworms and b)
silent moths (i.e., those giving no evidence of significant sound production when
manipulated within about an inch of the microphone). The two sound emitting
groups comprised a) Halysidota tessellaris and b) one or two closely related species
of Apantesis, or tiger moths, including Apantesis virgo. Thirty-five of the 37 tosses of
silent targets resulted in definite attempts at capture, whereas only 6 of the 24 tosses of
sound emitting targets resulted in such attempts. Actually, the Halysidota moths were
not tested for sound emission in the experimental situation. No active tiger moths
(out of 14 tosses) were attempted. One tiger moth was projected about a dozen times
in front of a microphone and in each case produced clicks.
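
The counts reported with Figure 5 (35 attempts in 37 tosses of silent targets, 6 attempts in 24 tosses of sound emitting targets) can be put through a simple exact test as a rough check on how lopsided they are. The Python sketch below is an illustration only. It assumes the tosses were independent trials, which is doubtful given the small number of bats and moths involved, and the choice of a one-sided Fisher-type test is ours, not the author's. The Halysidota figures are derived by subtraction from the counts in the text.

```python
from math import comb

# Counts taken from the text accompanying Figure 5.
SILENT_TOSSES, SILENT_ATTEMPTS = 37, 35
SOUND_TOSSES, SOUND_ATTEMPTS = 24, 6
TIGER_TOSSES, TIGER_ATTEMPTS = 14, 0

# Derived by subtraction (not stated directly in the text).
HALYSIDOTA_TOSSES = SOUND_TOSSES - TIGER_TOSSES        # 10
HALYSIDOTA_ATTEMPTS = SOUND_ATTEMPTS - TIGER_ATTEMPTS  # 6


def hypergeom_pmf(k, group_size, other_size, total_attempts):
    """Probability of exactly k attempts in the group if attempts were dealt out at random."""
    total = group_size + other_size
    return comb(group_size, k) * comb(other_size, total_attempts - k) / comb(total, total_attempts)


def fisher_lower_tail(k_observed, group_size, other_size, total_attempts):
    """One-sided exact p-value: k_observed or fewer attempts in the group."""
    k_min = max(0, total_attempts - other_size)
    return sum(
        hypergeom_pmf(k, group_size, other_size, total_attempts)
        for k in range(k_min, k_observed + 1)
    )


if __name__ == "__main__":
    p = fisher_lower_tail(
        SOUND_ATTEMPTS, SOUND_TOSSES, SILENT_TOSSES, SILENT_ATTEMPTS + SOUND_ATTEMPTS
    )
    print("Halysidota:", HALYSIDOTA_ATTEMPTS, "attempts in", HALYSIDOTA_TOSSES, "tosses")
    print("one-sided exact p-value:", p)
```

Under the independence assumption the p-value is far below one in a million. That agrees with the text's impression of a marked difference, but it does not remove the caution about the very limited supply of moths.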


Certain  tests,  however,  were  made  on  the  sound  emission  of  the  moths 
used,  including  several  Halysidota  and  several  tiger  moths.*  These  tests 
suggested  that  the  production  of  clicks  by  Halysidota  moths  was  somewhat 
less  reliable  than  the  production  of  clicks  by  tiger  moths.  In  all  cases  where 
active  tiger  moths  were  tossed  in  front  of  the  microphone,  clicks  were 
noted.  In  the  case  of  one  moth  and  one  bat,  an  effort  was  made  to  see  if 
the  occurrence  of  clicks  in  the  moth  was  related  to  the  avoiding  action  of 
the  bat.  In  this  one  case,  a  tiger  moth  was  projected  upward  four  times 
and each time was avoided by the Myotis bat under test (Figure 6); it also
always produced clicks when projected close to a microphone between
tosses to the bat. In the course of a long series of tests, significant clicks
finally disappeared. At this point, the moth was again projected by catapult
to the same bat. The bat immediately made an attempt, and caught successfully
on the second try (Figure 7). Since the moth was also less active at
this juncture, the catch could have been due to the simpler nature of the
interception. During the test, however, this bat was making serious pursuits
of other moths, regardless of their degree of activity, suggesting that
sound emission rather than flight activity was the key factor. While no
definite conclusion can be drawn from this single instance, there is again
the suggestion that the bat’s pursuit may have been significantly influenced
by the ultrasonic pulses of the moth. Taken together, the several instances
seem to warrant the guess that certain moths may sometimes escape capture
by emitting ultrasonic pulses when pursued.

* Sound emission was tested chiefly under the following four conditions: 1) holding
the moth by the wings (thus eliminating possible sound production from the
wing action); 2) rolling the moth slowly around on the inside of a jar; 3) dropping
the moth in front of a microphone; and 4) projecting the moth upward in front of
a microphone by use of the same gun that was used for projecting targets to bats.


Figure 6 Reactions of Myotis lucifugus to Tiger Moths: Avoidance of Active Tiger
Moths. Note the curling back of the bat’s ears and the slight closure of the mouth,
often noted when bats relinquish pursuit.


Figure  7  Reactions  of  Myotis  lucifugus  to  Tiger  Moths:  Catch  of  Inactive  Tiger 
Moth.  Tests  had  indicated  absence  of  significant  clicks  prior  to  this  toss. 


Accepting  for  the  moment  the  speculation  that  moths  emit  ultrasonic 
pulses  to  escape  capture  by  bats,  we  must  next  inquire  as  to  how  the  moth- 
emitted  pulses  act  upon  the  bat.  Do  they  warn  the  bat  of  some  obnoxious 
or dangerous attribute? Are they pseudowarning signals mimicking, perhaps,
the  signals  of  some  rare  or  remote  target  with  dangerous  attributes,  or 
might  they  act  chiefly  to  jam  the  bat  or  to  produce  clutter  which  the  bat 
could  not  easily  resolve?  They  might,  for  example,  be  triggered  singly  or  in 
groups  so  as  to  give  phantom  echoes,  or  echoes  producing  false  impressions 
of  position,  path,  or  velocity.  At  this  early  stage  we  do  not  know.  Thus 
much,  however,  we  can  say:  Halysidota  moths  represent  one  of  several 
species  possessing  fine  spurs  and  claws  which  project  from  long  and  spindly 
legs.  These  features  enable  such  moths  to  cling  obnoxiously  to  a  bat. 
Myotis  bats,  for  the  most  part,  prevent  such  clinging  by  their  techniques 
of  catch  (Figure  8).  But  occasionally  their  procedure  fails;  and  under  such 
conditions  they  have  been  observed  to  struggle  for  a  significant  portion  of 
a  minute,  losing  altitude  and  expending  obviously  unusual  effort.  Red  bats 
commonly  use  a  different  procedure  for  catching,  and  they  often  hold  their 
targets  in  a  fur-lined  pouch  formed  by  the  interfemoral  membrane  and  the 
hind  legs  (Figure  9).  It  seems  likely  that  the  Halysidota  moths  would  thus 
have  a  much  better  chance  to  grasp  a  red  bat  than  a  Myotis  bat;  and  that 
they  might  indeed  constitute  a  menace  which  the  red  bat,  when  warned, 
would  assiduously  seek  to  avoid. 

The riddle of the tiger moth remains with few clues pointing to a solution.
It is true that some moths of the tiger group have proved unpalatable
to  some  animals,  but  the  tiger  moths  under  test  were  eagerly  devoured  by 
Myotis  bats.  The  case  for  pure  jamming,  in  the  sense  that  the  bat’s  hearing 
of  echoes  from  the  moth  is  prevented  by  the  moth’s  pulses,  seems  unlikely. 
Existing  evidence  suggests  that  jamming  is  difficult,  even  by  wide  band 
noise  within  the  bat’s  frequency  range,  once  a  bat  has  locked  onto  its  target. 
Evaluation of obstacles in the presence of very high noise levels has also
been amply demonstrated (11, 12). The possibility of phantom or misleading
pulse configurations remains open. Experiments have shown that
many Myotis bats avoid clusters of targets, and targets that are too suddenly
displaced. But other Myotis bats, and certainly the one red bat thus far
tested, are extraordinarily proficient at selecting one out of many moving
targets in a cluster (Figures 10, 11, and 12). One thing, however, does
seem to interfere with the pursuit procedure of most bats; namely, the existence
of one large target and one or more near-by smaller targets. Though
highly unlikely, it is conceivable that the moth’s pulses could create some
such impression. At this stage, however, all such ideas are purely speculative
and serve only to indicate some of the directions the search may take.


Figure 8 Catch of Halysidota by Myotis lucifugus. The moth is normally quickly
seized, as here, so that the legs point outward and cannot grasp the bat. Occasionally
such moths succeed in grasping the bat and cause obvious distress.


Figure 9 Pouching Technique of Lasiurus borealis. The red bat (Lasiurus borealis)
frequently holds its prey momentarily in the deep pouch formed by the interfemoral
membrane and the hind legs. Apparently the bat thus has a chance to reorient itself
before seizing the prey with its mouth. The use of this technique, together with the
fur that surrounds the membrane, may enhance the moth’s chance of gaining a hold.
For the red bat catching Halysidota the problem may be more serious than is true
for Myotis and Eptesicus. (Insert gives a detailed representation of the top image
of the three-flash sequence showing the catch. Wing of moth projects out of pouch.)


Figure  10  Selection  of  One  Target  Out  of  a  Cluster:  Catch  of  One  Out  of  Eight 
Mealworms by Myotis lucifugus. Two flashes occurred while the bat was out of the
field; the lowest group of eight mealworms corresponds to the second flash. (In the
upper  two  overlapping  images,  the  bat  is  reaching  into  its  tail  membrane  to  seize  the 
captured  prey.) 


Figure  11  Selection  of  One  Target  Out  of  a  Cluster:  Catch  of  One  Out  of  Fifteen 
Mealworms  by  Lasiurus  borealis.  At  the  last  image  (left),  the  mealworm  is  in  the 
bat’s  pouch.  On  the  next  circuit  by  the  bat,  25  mealworms  were  tossed  and  the  bat 
turned  away. 


The  final  aspect  of  the  moth’s  system  deserving  mention  is  the  mecha¬ 
nism  of  sound  production.  The  sound  generating  mechanism  of  Halysidota 
is  illustrated  in  Figure  13.  Studied  intensively  by  Blest,  Collett,  and  Pye 
(2),  the  mechanism  was  found  to  operate  much  in  the  manner  of  an  array 
of  toy  clickers  or  crickets.  By  pounding  dents  into  a  strip  of  spring  steel  and 
then  bending  the  strip  so  that  the  dents  popped  out  and  releasing  it  so  the 
dents  popped  in,  these  investigators  generated  sounds  which  were  virtually 
indistinguishable  from  the  pulses  of  the  moths  when  slowed  down  with  the 
use  of  a  variable-speed  tape  recorder.  Similar  tymbal  organs  have  been 
noted  on  the  metepisternum  of  many  arctiid  moths.  In  the  tiger  moth, 
Apantesis  virgo,  however,  no  row  of  dents  or  creases  was  evident,  in 
keeping  with  the  observation  that  the  pulses  of  the  tiger  moth  tended  to  be 
single  or  paired  rather  than  grouped.  A  comparison  of  the  pulses  and  pulse 
sequences  of  Halysidota  and  tiger  moths  is  given  in  Figures  14  and  15.  The 




Figure  12  Selection  and  Catch  of  Mealworm  from  Three  Spheres  by  Lasiurus 
borealis.  Two  of  the  spheres  approximated  the  average  reflectance  of  a  mealworm; 
the  other  was  smaller.  Placed  above  the  path  of  the  bat  is  the  pulse  sequence  for  this 
catch.  Connecting  lines  show  the  approximate  relation  of  approach  path  to  emitted 
pulses.  The  reason  for  the  terminal  change  in  angle  of  these  lines  (end  of  sequence 
being  indicated  by  heavier  broken  line)  is  the  sudden  slowing  down  by  the  bat  just 
before  the  catch.  In  Figures  33,  34,  35,  and  36  the  relations  of  signal  to  action  are 
given  in  more  detail. 


Figure 13 Tymbal Organ of Halysidota tessellaris. Upper portion of Figure shows
position  of  tymbal  organ  (see  arrow)  below  wing  bases,  with  tympanic  opening  show¬ 
ing  as  dark  spot  just  posteriorly.  At  right  side  is  eye  with  antenna  crossing  it.  Lower 
portion of Figure shows microtymbals. They appear as a row of curved creases or
grooves  in  the  hard  shelled  and  hollow  metepisternal  sclerite.  When  the  row  is  bent 
by  muscular  action  the  grooves  pop  out,  producing  a  set  of  clicks.  Another  set  of 
clicks  occurs  as  the  grooves  pop  in  again  with  relaxation  of  the  muscle  (see  Reference 
2  for  details). 




Figure  14  Pulse  Sequences  and  Pulse  Groupings  of  Apantesis  and  Halysidota 
Moths:  Apantesis  (Tiger  Moth)  Pulses.  The  upper  pair  of  traces  shows  a  typical 
sequence  of  pulses  recorded  from  a  moth  that  is  being  handled  within  an  inch  or  two 
of  the  microphone.  Individual  pulses  are  numbered,  and  four  groups  of  interest  are 
designated  by  letters.  The  enlarged  tracing  just  below  shows  a  pair  of  pulses  similar 
to  the  pair  labelled  A.  Along  the  side  of  this  tracing  is  a  frequency  calibration  which 
refers  to  the  output  of  a  zero-crossing  meter.  Intervals  between  crossings  appear  as 
dots.  In  the  present  pair  of  pulses  the  crossings  correspond  to  a  clustering  of  fre¬ 
quency  components  around  45  to  70  kc  for  the  first  pulse,  and  roughly  35  to  50  kc 
for  the  second.  Pulse  durations  are  roughly  3  milliseconds.  Details  of  the  frequency 
structure  cannot  readily  be  seen  in  the  upper  tracings.  However,  the  different  fre¬ 
quency  clusterings  are  readily  observable.  Thus,  group  B  (Numbers  11  through  14) 
shows  pulses  alternating  between  higher  and  lower  frequencies,  but  the  second  pair 
(13  and  14)  shows  a  wider  frequency  scatter  than  the  first  pair  (11  and  12).  In 
groups  C  and  D  the  ordering  of  the  frequency  groups  is  different:  two  lower  fre¬ 
quency  pulses  are  followed  by  two  higher  frequency  pulses  in  each  case.  Presumably 
the  two  frequency  ranges  correspond  to  the  tension  and  relaxation  modes  of  sound 
generation;  the  smaller  and  larger  frequency  scatters  are  associated  with  left  and  right 
positions.  However,  this  moth  has  the  capacity  to  control  various  features  of  its 
pulses,  most  notably  pulse  length.  By  cutting  down  the  resonance  of  the  terminal  por¬ 
tion  of  the  pulse,  apparently,  the  length  can  be  reduced  to  one-third  millisecond  or  less. 
Such  abbreviated  pulses  are  also  recorded  when  the  tymbal  organs  are  cut  open  (see 
insert  at  right).  The  moth  may  indeed  have  some  control  over  frequency  and  fre¬ 
quency  scatter  of  the  individual  pulses,  as  also  of  their  duration. 
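
The zero-crossing readout described in the caption of Figure 14 can be illustrated with a short sketch. It is an illustration in modern terms under stated assumptions, not a description of the instrument used for the Figure: the signal is a synthetic, digitized 50 kc tone, and the function names, the sampling rate, and the 0.3 msec duration are all hypothetical choices. The only idea shown is that the interval between successive zero crossings is half a period, so that each interval yields one frequency estimate (one "dot").

```python
import math


def zero_crossing_frequencies(samples, sample_rate_hz):
    """Estimate frequency from the intervals between successive zero crossings."""
    crossing_times = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if (a < 0 <= b) or (a >= 0 > b):
            # Linear interpolation locates the crossing between the two samples.
            frac = a / (a - b)
            crossing_times.append((i - 1 + frac) / sample_rate_hz)
    # Two successive crossings are half a period apart, hence the factor of 2.
    return [1.0 / (2.0 * (t1 - t0))
            for t0, t1 in zip(crossing_times, crossing_times[1:])]


def synthetic_pulse(freq_hz, duration_s, sample_rate_hz):
    """Hypothetical stand-in for a recorded pulse: a pure tone, arbitrary phase."""
    n = round(duration_s * sample_rate_hz)
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate_hz + 0.3)
            for k in range(n)]


SAMPLE_RATE_HZ = 1_000_000  # assumed sampling rate, not taken from the source
pulse = synthetic_pulse(50_000.0, 0.0003, SAMPLE_RATE_HZ)
estimates = zero_crossing_frequencies(pulse, SAMPLE_RATE_HZ)
mean_khz = sum(estimates) / len(estimates) / 1000.0
print(f"{len(estimates)} intervals, mean estimate {mean_khz:.1f} kc")
```

A pulse whose frequency components are scattered would give a spread of estimates rather than a single value, which is roughly the kind of clustering the dots in the tracing are described as showing.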




Figure  15  Pulse  Sequences  and  Pulse  Groupings  of  Apantesis  and  Halysidota 
Moths:  Halysidota  Pulses.  The  conditions  and  arrangements  are  the  same  as  for  the 
previous  Figure,  with  the  exception  of  the  numbering.  Here,  each  number  specifies 
a  total  tension-relaxation  group  of  pulses.  The  enlarged  tracing  is  of  the  first  pair  of 
pulse  groups  (A),  where  the  initial  phase  of  operation  of  the  sound  mechanism  pro¬ 
duced  six  pulses  and  the  ensuing  phase  produced  six  still  more  closely  spaced  pulses. 
An  isolated  pulse  occurred  shortly  afterward,  and  other  more  or  less  isolated  pulses 
occur  in  the  regions  labelled  B  and  C.  The  longest  group  in  the  series,  labelled  D, 
contains  eleven  pulses  in  each  phase.  Much  variation  in  pulse  numbers  and  spacings 
occurs  through  the  record.  In  the  enlarged  tracing  it  is  evident  that  the  frequency 
components  are  concentrated  close  to  80  kc,  with  little  evidence  in  this  record  of  a 
lower  fundamental  (as  noted  in  Reference  2).  There  is  also  little  of  the  frequency 
scatter  seen  in  the  pulses  of  Apantesis.  Indeed,  the  slowed  down  pulses  of  Halysidota 
sound  much  like  the  to-and-fro  tinkling  of  a  slightly  rusty  bicycle  bell,  while  those 
of  Apantesis  sound  very  similar  to  corks  being  pulled  out  of  a  couple  of  bottles  in 
sequence.  Individual  pulses  of  Halysidota  are  mostly  of  one-third  to  one-half  milli¬ 
second  in  duration,  while  the  total  two-phase  grouping  typically  runs  from  about  8 
msec  (in  Number  9)  to  about  16  msec  (in  Number  19). 


story,  however,  appears  to  be  more  complex  than  the  simple  tymbal  organ 
suggests;  for  abbreviated  pulses  were  readily  detectable  even  after  complete 
longitudinal  cutting  of  the  outer  tymbal  surfaces.  It  is  thus  possible  that  two 
or  more  sound  producing  mechanisms  or  modes  of  operation  contribute  to 
the  click  complex  of  these  moths  and  give  rise  to  a  versatility  of  sound 



emission  that  produces  more  than  one  effect  upon  the  auditory  reception 
of  bats. 

The  properties  of  the  targets  and  the  implications  for  their  bat  predators 
can  perhaps  be  summarized  as  follows.  The  flying  targets  caught  by  bats 
exhibit  a  great  diversity  of  size,  probably  from  about  one-fifth  of  a  milligram 
to  over  200  milligrams — a  range  of  1000  times  or  more.  The  evidence 
suggests  that  although  many  bats  may  not  select  the  very  small  insects 
habitually,  they  probably  can  when  necessary.  The  signals  and  techniques  of 
bats  appear,  for  the  most  part,  to  provide  for  such  contingencies.  Yet,  as 
mentioned  earlier,  the  signals  and  techniques  of  different  bats  may  be 
adapted  primarily  for  the  capture  of  particular  groups  of  insects.  Some  bats, 
for  example,  seem  to  specialize  in  catching  very  small  insects  in  rapid 
sequence,  whereas  others  seem  to  specialize  in  making  less  frequent  cap¬ 
tures  of  much  larger  and  more  difficult  targets.  Some  larger  targets,  notably 
certain  night-flying  moths,  have  evolved  special  methods  of  escape.  Many 
possess  sensitive  hearing  and,  when  pursued,  initiate  violent  evasive  maneu¬ 
vers  of  an  apparently  unpredictable  nature.  Others  evidently  send  out  ultra¬ 
sonic  pulses  themselves  in  response  to  the  pulses  emitted  by  the  bat,  these 
pulses  serving  in  several  different  ways  to  produce  cessation  of  the  bat’s 
pursuit.  With  this  background,  the  rationale  behind  the  bat’s  methods  may 
be more evident, and we can examine in a little more detail the actual procedures
used by the bats and the relation of these procedures to the bat’s
signals  and  to  the  action  of  the  targets. 


ANALYSIS  OF  OBSERVED  DATA 
ON  INTERCEPTIONS 

The  first  part  of  this  paper  has  been  devoted  to  a  survey  of  some  of  the 
main  features  of  insectivorous  bats,  their  echolocation  signals,  and  some 
special  aspects  of  their  survival  problems,  with  particular  reference  to  the 
interception  of  the  insect  targets  on  which  they  live.  The  second  part  will 
illustrate  a  few  specific  results  of  the  preliminary  observations  on  how  they 
catch,  how  accurate  their  aim  is,  and  how  they  deal  with  special  problems 
mentioned  in  the  sections  above.  The  questions  we  are  trying  to  answer 
are  roughly  these : 

(1)  How  do  bats  catch?  How  rapidly  do  they  catch?  How  accurate 
is  their  aim  and  how  reliable  their  technique? 



(2)  What  can  a  bat  judge  about  its  target  in  advance  of  capture?  For 
example,  can  it  identify  the  target  and  predict  what  the  target’s  action  is 
likely  to  be? 

(3)  How  are  the  bat’s  signals  related  to  the  interception  problem 
they  are  solving? 

In  the  past,  popular  belief  has  been  quick  to  overlook  the  more  obvious 
requirements  of  a  bat’s  interception  problems.  The  erratic-seeming  es¬ 
capades  of  bats  against  the  sky  have  commonly  been  considered  as  idle 
play  or  unexplained  perversity.  To  catch  the  insects  on  which  it  lived,  a 
bat  was  presumed  to  fly  straight  ahead  with  its  mouth  open,  scooping  in 
its prey from insect swarms much as a whale scoops plankton from the sea,
or  possibly  by  homing  in  on  the  steady  hum  of  a  straight-flying  mosquito 
or  beetle.  The  absurdity  of  the  idea  that  bats  could  feed  by  simply  flying 
straight  ahead  with  their  mouths  open  has  been  amply  illustrated  by  the 
calculations  of  Griffin  and  others  (5,  6,  7,  8,  14,  40).  A  bat  could  not 
obtain  even  a  thousandth  of  its  required  food  in  such  a  way.  Homing  in 
by  passive  listening  to  the  insect’s  hum,  while  it  may  sometimes  be  used, 
could  not  account  for  the  rapid  and  skillful  interceptions  that  have  been 
observed  on  essentially  silent  targets. 

Far  from  being  simply  the  aimless  and  erratic  dartings  of  a  playful 
creature,  virtually  every  plunge  or  zoom  may  end  in  a  precise  interception 
and catch, initiated at distances that may vary from 1 foot to as much as 20
feet or more. Successive catches, moreover, may be achieved in remarkably
short intervals of time. It is clear that a bat’s signal system not only
permits  the  bat  to  perform  interceptions  on  difficult  targets  with  extra¬ 
ordinary  speed  and  precision,  but  that  it  also  enables  a  bat  to  avoid  serious 
mishap  and  to  survive  in  a  complex  environment  for  periods  as  long, 
perhaps,  as  25  years.  In  the  following  discussion,  we  look  at  some  of  the 
details  which  go  into  the  bat’s  extraordinary  performances. 

How  do  Bats  Catch? 

Preliminary  Observations.  Early  in  the  course  of  the  present  studies,  pre¬ 
liminary  observations  of  Myotis  lucifugus  bats  catching  mosquitoes  and 
fruit  flies  suggested  that  the  bats  caught  by  one  of  two  methods.  In  one 
case,  they  appeared  to  aim  at  a  straight-flying  target,  the  target  apparently 
entering  the  mouth  directly.  In  the  second  case,  the  target  appeared  to  be 
slightly  below  the  bat  or  slightly  off  to  one  side.  Here  the  bat  seemed  to 
scoop  it  up  with  the  tail  membrane,  balling  up  tightly  and  reaching  down 



with  its  mouth  to  pick  out  the  captured  insect.  Photographs  soon  revealed, 
however,  that  the  tail  membrane  was  always  brought  up  over  the  mouth  at 
the  time  of  capture,  but  that  in  some  cases  the  bat’s  mouth  did  indeed 
continue  directly  for  the  insect.  (Recent  high  speed  films  showing  details 
of catches by Rhinolophus ferrum-equinum indicate variable use of the tail
membrane. Extensive use is made of one or both wings, with direct transfer
from wing to mouth apparently sometimes occurring.) Often the tail membrane
was brought up only for a dozen or so milliseconds, apparently either
to  help  knock  the  insect  into  the  mouth  or  to  make  certain  the  insect  did 
not  escape  (as,  for  example,  in  case  of  a  small  error  of  aim  or  of  a  rapid 
last-instant  maneuver  by  the  target).  Additional  photographs  showed  that 
when  a  fruit  fly  was  off  to  one  side  it  was  sometimes  first  captured  in  the 
membrane  of  one  of  the  wings.  The  most  striking  feature  of  these  pictures, 
however,  was  the  very  precise  aim  of  the  head  and  ears  (Figures  17,  22, 
23,  and  25).  Almost  invariably  the  aim  was  so  precise  that  it  was  im¬ 
possible  to  detect  any  error,  at  least  from  the  pictures.  It  was  thus  clear 
that  the  bat  knew  exactly  where  the  fruit  fly  was  and  could  keep  its  head 
directed  toward  it  accurately  at  all  times.  Also  obvious,  however,  was  the 
fact  that  the  bat  could  not  always  get  its  body  to  the  correct  spot  and 
that  in  such  cases  it  appeared  to  reach  out  a  wing  for  initial  contact. 
Such  catches  did  not  necessarily  represent  an  error  of  aim.  Catches  of 
fruit  flies  are  shown  in  Figures  16,  17,  18,  and  19. 

Techniques  of  Catch  Used  by  Myotis  lucifugus.  To  photograph  the  details 
of  catches  in  the  laboratory,  a  somewhat  artificial  situation  was  used.  A 
solenoid-driven  “gun”  was  developed  (by  D.  A.  Cahlander).  This  gun 
projected  the  targets  upward  to  a  prescribed  zone  so  that  the  interceptions 
could  be  accurately  photographed.  The  initial  target  was  the  standard 
laboratory food of bats, namely the mealworm (roughly 2½ by 20 mm
in  size).  Although  such  targets  moved  in  a  more  or  less  vertical  ballistic 
trajectory,  very  different  from  the  trajectory  of  fruit  flies,  the  bats  appeared 
to  use  very  much  the  same  techniques  of  catch  (40).  Certain  variations 
and  additions,  however,  were  noted.  The  techniques  observed  can  most  con¬ 
veniently  be  divided  into  three  somewhat  overlapping  categories  as  follows. 

Tail  Membrane  Catch  (Figure  20):  When  the  bat  is  able  to  make  a 
reasonable  evaluation  of  the  trajectory  of  a  ballistic  target,  and  can  readily 
direct  its  flight  path  to  the  intercept  point,  it  characteristically  attempts  a 
tail  membrane  catch.  Here  the  bat’s  aim  is  such  that  the  target  normally 
strikes  close  to  the  center  of  the  membrane  just  as  the  membrane  is  snapped 




Figure  16  Accuracy  of  Aim  by  Myotis  lucifugus.  Though  not  of  a  fruit  fly  catch, 
this  picture  of  the  catch  of  a  sphere  indicates  both  the  accuracy  of  the  aim  of  the 
bat’s  head  during  the  approach  and  the  extremely  precise  centering  of  the  target  at 
the  point  of  contact  with  the  tail  membrane.  Note  that  the  target  is  not  followed  with 
the  head  at  the  last  instant,  a  situation  commonly  noted  with  ballistically-falling 
targets.  Myotis  bats  have  trouble  evaluating  rapid  vertical  velocities,  and  during  early 
tests  typically  miss  targets  traveling  rapidly  upward  or  downward.  Accuracy  of  aim 
for  larger  targets  is  normally  established  at  a  distance  of  three  feet  or  more  (see 
Figure  25).  Some  of  the  pictures  indicate  like  accuracy  at  roughly  two  feet  when  the 
target  is  a  fruit  fly. 


forward to form a scoop or pouch. Head and membrane are then brought
together  and  the  bat  either  grasps  the  target  immediately  (coming  out 
again  in  as  little  as  one-twentieth  of  a  second)  or  spends  up  to  several 
seconds  adjusting  the  grasp  of  the  target  with  its  mouth. 

Wing Scoop Catch (Figure 21): Very commonly, the bat finds it easier



to  reach  out  for  the  target  with  its  wing.  Apparently,  however,  the  bat’s 
reflex  technique  of  seizure  makes  use  of  the  tail  membrane  and,  conse¬ 
quently,  the  target  always  appears  to  be  shovelled  into  the  pouch  of  this 
membrane  before  it  is  grasped  in  the  mouth. 

Wingtip  Catches  (Figures  22  and  23):  Sometimes,  particularly  with  a 
rapidly  moving  target,  the  bat  is  able  to  reach  the  target  only  with  (or 
close  to)  the  tip  of  its  outstretched  wing.  When  this  contingency  is  ex¬ 
pected,  the  bat  often  uses  a  rather  ingenious  method.  Prior  to  contact, 


Figure 17 Typical Direct Catch of a Fruit Fly by Myotis lucifugus. The fly is
headed  downward  and  toward  the  bat’s  right,  the  bat  evidently  following  its  path 
with  precision.  At  the  center  image  (flash  number  2)  the  bat’s  tail  membrane  is 
moving  rapidly  forward,  and  presumably  scoops  in  the  fly  an  instant  later.  The  last 
flash  shows  the  bat  coming  out  of  the  catch  with  its  mouth  closed.  About  1/20 
second  after  this  stage  the  bat  is  normally  emitting  pulses  and  is  ready  for  another 
catch. 


Figure  18  Possible  Use  of  the  Wing  in  the  Capture  of  Fruit  Flies.  This  Figure 
shows  a  forward  reach  somewhat  similar  to  that  of  Figure  17,  but  in  this  case  with 
sharply  bent  wing  tip.  The  shadow  of  the  fly  shows  as  a  dark  spot  on  the  wing.  In  the 
second  image  the  bat  is  coming  out  of  its  catch. 




Figure  19  Possible  Use  of  the  Wing  in  the  Capture  of 
Fruit  Flies.  This  Figure  has  only  a  single  set  of  images, 
and  the  sequence  of  events  cannot  be  judged.  Late  de¬ 
tection  far  to  one  side,  however,  may  have  produced  the 
need  for  reaching  with  the  tip  of  the  wing. 


Figure  20  Tail  Membrane  Catch  of  a  Mealworm  by  Myotis  lucifugus.  Apparently 
the  quickest  and  simplest  method  of  catching  is  with  a  scooping  action  of  the  tail 
membrane.  The  bat’s  aim  is  commonly  good  enough  so  that  the  mouth  is  snapped 
directly  to  the  point  of  contact  of  the  target  and  membrane,  the  total  in/out  action 
sometimes  occupying  as  little  as  1/20  of  a  second.  Inadequate  localization  or  im¬ 
proper  grasp  of  the  target  may,  however,  lead  to  several  seconds  of  manipulation 
before  the  bat  straightens  out  for  normal  flight.  (Center  image  shows  the  mealworm 
in  the  tail  pouch.) 




the  bat  bends  the  tip  of  its  wing  sharply  over,  the  wing  being  moved 
in  such  a  manner  that  the  target  either  lodges  directly  in  the  groove  so 
formed  or  slides  there  after  contact.  Sometimes  the  bat  appears  to  make 
several  separate  motions  of  the  wing  while  attempting  to  snag  a  target 
correctly. But once the target is lodged in the groove, the wing is pulled


Figure 21 Wing Scoop Catch by Myotis lucifugus. With fast-moving and
maneuvering  targets,  especially,  the  bat  apparently  often  finds  it  impos¬ 
sible  or  inconvenient  to  snare  the  target  directly  with  the  tail  membrane, 
and  certain  bats  may  actually  prefer  to  retrieve  most  targets  with  the  use 
of  a  wing.  Before  the  target  is  seized  with  the  mouth,  however,  it  is  normally 
dumped into the forward-moving tail pouch with a sort of shovelling action
of  the  wing. 

rapidly  inward,  thus  bringing  the  target  into  the  pouch  formed  by  the 
forward-moving  tail  membrane.  The  insect  is  then  brought  to  the  mouth 
as  before.  Figure  24  illustrates  well  the  extreme  accuracy  of  aim  often 
achieved  with  these  wingtip  reaches.  Accuracy  of  localization  at  some 
distance  is  indicated  in  Figure  25. 

Catches  by  Other  Bats.  Catch  techniques  used  by  other  bats  have  also 
been  photographed.  By  and  large,  these  techniques  (with  some  variations  of 
emphasis  or  detail)  correspond  to  the  techniques  observed  with  Myotis 
lucifugus.  The  most  notable  exception  thus  far  seen  is  the  backward  som¬ 
ersault  catch  of  the  red  bat,  Lasiurus  borealis,  described  below.  Other 
catches  of  interest  are  the  following.  The  wingtip  catch  of  an  underwing 




Figure  22  Wingtip  Catches  by  Myotis  lucifugus.  Targets  threatening  to  go  quickly 
out  of  reach  often  result  in  a  somewhat  special  kind  of  wing  catch.  In  this  Figure, 
for  example,  a  falling  target  is  just  barely  reachable  with  the  tip  of  the  wing.  By 
bending  the  tip  sharply  over  in  advance  of  contact  the  bat  prevents  the  target  from 
slipping  off  the  end  of  the  wing.  A  rapid  inward  pull  then  puts  the  target  in  the 
tail  pouch. 


moth  by  the  Old  World  horseshoe  bat,  Rhinolophus  ferrum-equinum, 
described  elsewhere  (18),  is  of  chief  interest  because  it  illustrates  that  the 
very  different  signal  system  used  by  these  bats  still  permits  extreme  ac¬ 
curacy  of  aim.  Recent  films  have  supported  previous  observations  indicating 
extensive  use  of  the  wing  membranes  to  execute  catches,  but  have  also 
shown  that  use  of  the  tail  membrane  is  sometimes  rather  similar  to  that 




Figure  23  Wingtip  Catches  by  Myotis  lucifugus.  Here  we  see  a  situation  similar  to 
that  of  Figure  22,  with  a  target  rising  rapidly  above  the  bat.  Shadow  indicates  that 
the  mealworm  is  just  making  contact  with  the  wing. 

seen  in  Myotis.  Myotis  keenii,  a  close  relative  of  Myotis  lucifugus,  appears 
to  specialize  in  slow  flight  and  extremely  quick  maneuvering.  Three  at¬ 
tempts  at  a  target  are  sometimes  made  within  one  second.  The  large  brown 
bat,  Eptesicus  fuscus,  has  not  yet  made  catches  of  mealworms  in  the  labora¬ 
tory.  A  few  pictures  made  outdoors  suggest  that  its  techniques  of  catch 
are  very  similar  to  those  of  Myotis  lucifugus,  but  that  its  flight  speed  is 
higher. It appears to achieve very accurate aim toward its target at


Figure 24 Wingtip Hit of Small Sphere by Myotis lucifugus. Extreme accuracy of
localization of a 3 mm sphere is obvious in this picture. During the process of capture
such small, hard targets tend to bounce around in the membranes and are often lost
(as happened here) before seizure with the mouth. If caught, such targets may be
chewed for some time before being recognized as inedible.


distances of 6 feet or more. One difference noted with Eptesicus bats is the
apparent  use  of  passive  listening  in  the  preliminary  location  of  certain  tar¬ 
gets.  These  bats  sometimes  appear  to  be  attracted  to  single  buzzing  insects 
and  to  swarms  of  smaller  insects,  even  when  not  emitting  pulses  themselves. 
Whether  or  not  they  can  home  in  on  these  sounds  and  achieve  captures 
without  the  use  of  the  usual  terminal  echolocating  signals  has  not  been 
established.  Were  this  true,  it  would  be  interesting  to  know  whether  they 
also homed in on some of the noise-making moths which might be


Figure 25 Accuracy of Aim at a Distance. Myotis lucifugus typically follow their
targets very accurately with the aim of their heads, from a distance of 3 feet (as
here) or more. Similar aim at a distance of about 6 feet has been observed out of
doors with Eptesicus fuscus.


obnoxious to other bats but suitable food targets for Eptesicus. No systematic
investigation  has  thus  far  been  carried  out  on  the  manner  in  which  such 
bats  might  combine  the  use  of  passive  listening  with  echolocation.* 

As  mentioned  above,  the  most  notable  exception  to  the  catch  techniques 
of  Myotis  lucifugus  is  seen  in  the  somersault  catch  of  the  red  bat,  Lasiurus 
borealis.  This  bat  has  been  observed  to  use  all  the  techniques  seen  in  Myotis 
bats,  though  often  with  what  appears  to  be  a  higher  order  of  predictive 
skill  (note  anticipative  aim  in  Figures  27  and  28).  In  a  significant  pro¬ 
portion  of  cases,  also,  the  red  bat  uses  a  different  technique.  The  bat  often 
approaches  as  though  it  were  going  to  fly  underneath  the  target  (Figure 
26); moreover, unlike Myotis, it may not follow the target at all with its
head  and  ears.  About  the  time  the  red  bat  reaches  the  target,  for  a  somer¬ 
sault  catch,  it  suddenly  swings  its  tail  membrane  around  in  a  complete  loop 
of  360  degrees,  typically  bringing  it  to  zero  air  velocity  at  the  midpoint 
of  the  somersault.  By  moving  its  wings  forward  and  together,  it  produces 
a  sort  of  funnel  leading  into  a  deep  pouch  formed  by  the  hind  legs  and  the 
tail  membrane.  Not  uncommonly,  also,  a  target  off  to  one  side  is  shifted 
into  the  middle  of  this  funnel  by  the  bringing  together  of  the  wings,  or 
sometimes  by  a  shovelling  action  of  one  wing  somewhat  similar  to  that 
seen  in  Myotis.  With  the  target  lodged  in  the  pouch,  the  bat  comes  out  of 
its somersault, reorients itself, then reaches down into the pouch to seize
its  food.  The  evidence  gathered  with  the  only  red  bat  tested  in  the  labora¬ 
tory  suggests  that  this  technique  is  more  successful  in  the  retaining  of 
evasive  targets  and  certain  very  small  targets.  One  would  be  tempted  to 
guess  that  the  moving  together  of  the  wings  and  the  rapid  sweeping  action 
of  the  tail  membrane  through  the  slot  or  funnel  so  formed  was  well  de¬ 
signed  for  the  capture  of  evading  moths,  where  the  moth’s  last-instant 
maneuvers  may  prevent  precise  aim  by  the  bat  during  the  final  phase  of 
pursuit. 

Certain  features  of  the  red  bat’s  performance  deserve  mention.  For 
example,  the  one  red  bat  tested  has  proved  extremely  proficient  at  the 
selection  of  one  target  from  a  cluster  of  as  many  as  a  dozen  or  more  other 
near-by  targets  (Figures  10,  11,  and  12).  The  precise  and  rapid  retrieval 


*  Of  considerable  interest,  with  reference  to  passive  listening,  is  the  work  of 
Payne  on  the  accuracy  of  strikes  by  the  barn  owl.  Rustling  sounds,  which  include 
the  requisite  frequency  band,  permit  the  owl  to  localize  a  sound  source  to  within 
one  degree  of  angle  (see  Reference  21,  particularly  pp.  34-37  and  59-64).  Evidence 
for  accurate  localization  with  high  frequency  sound  components  by  human  beings 
has  been  given  by  Batteau  (1).  See  also  Mills  (19). 



of  the  selected  target  from  amongst  several  others,  closely  observed  in  high 
speed  films,  indicates  clearly  that  the  bat  did  not  simply  fly  into  the  cluster 
and  grab  what  it  happened  to  hit.  Another  notable  feature  of  the  red  bat’s 
interceptions  is  its  capacity  to  intercept  a  rapidly  falling  target,  even  near 
the  ground.  Its  success  at  some  of  these  high  speed  captures  appears  to 
derive  from  three  things : 


Figure  26  Somersault  Catch  by  Lasiurus  borealis.  To  show  details  of  the  action 
this  sequence  was  made  at  20  flashes  per  second  instead  of  the  usual  10.  Finely 
broken  line  indicates  path  of  the  bat’s  tail,  while  coarsely  broken  line  indicates  path 
of  the  bat’s  head.  Half  a  foot  to  a  foot  from  the  intercept  point  the  bat  puts  its 
wings  out  and  forward  to  provide  counter-resistance  for  the  rapid  sweeping  action  of 
the  tail  which  initiates  the  backward  somersault.  The  wings  are  then  brought  to¬ 
gether,  still  extended,  and  the  tail  membrane  is  formed  into  a  deep  pouch  into 
which  the  target  slides  (if  not  initially  caught  there).  Both  wingtips  are  sometimes 
bent  over,  meeting  in  the  middle,  apparently  to  prevent  the  escape  of  a  maneuvering 
target.  The  moving  together  of  the  wings,  or  the  reaching  out  with  a  single  wing, 
sometimes  also  serves  to  bring  an  offside  target  to  the  center.  During  the  somersault 
the  tail  membrane  may  actually  loop  backwards,  as  here.  Once  the  target  is  lodged 
in  the  pouch  the  bat  appears  to  reorient  itself  before  reaching  down  for  seizure. 
(In  this  catch  one  out  of  four  mealworms  was  selected,  but  the  other  three  were 
omitted  for  clarity.)  C  marks  the  point  of  capture. 



1.  The  red  bat  has  very  strong  wings  and  can  accelerate  rapidly. 

2.  It  appears  capable  of  predicting  the  optimum  point  of  intercept  for 
a  rapidly  falling  target. 

3.  It  seems  able  to  make  a  good  evaluation  of  the  situation  at  some 
distance.  The  various  features  of  the  red  bat’s  interceptions  and  catches 
are  illustrated  in  Figures  10,  11,  12,  27,  28,  29,  30,  31,  and  32. 
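
What predicting a point of intercept for a falling target involves, in purely geometric terms, can be sketched as follows. The sketch makes no claim about how a bat arrives at its prediction. It assumes a drag-free ballistic target, a pursuer flying in a straight line at constant speed, and metric units; all names and numerical values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def target_position(p0, v0, t):
    """Drag-free ballistic position (x, y) of a tossed target at time t."""
    x0, y0 = p0
    vx, vy = v0
    return (x0 + vx * t, y0 + vy * t - 0.5 * G * t * t)


def earliest_intercept(pursuer_pos, pursuer_speed, p0, v0, t_max=2.0, dt=0.001):
    """Step forward in time and return (t, point) for the first instant at
    which a constant-speed pursuer could have reached the target's position.
    Returns None if no such instant is found before t_max."""
    steps = int(t_max / dt)
    for k in range(1, steps + 1):
        t = k * dt
        px, py = target_position(p0, v0, t)
        distance = math.hypot(px - pursuer_pos[0], py - pursuer_pos[1])
        if distance <= pursuer_speed * t:
            return t, (px, py)
    return None


# Hypothetical numbers: target tossed straight up at 3 m/s from 1 m to the
# side of and 0.5 m below a pursuer flying at 4 m/s.
result = earliest_intercept(
    pursuer_pos=(0.0, 1.5), pursuer_speed=4.0, p0=(1.0, 1.0), v0=(0.0, 3.0)
)
if result is not None:
    t_hit, point = result
    print(f"intercept after {t_hit:.3f} s at x={point[0]:.2f} m, y={point[1]:.2f} m")
```

The point returned is where the target will be, not where it is at the moment of detection; aiming at the latter would leave the pursuer behind a fast-moving target.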

Summary.  The  present  section  was  intended  to  answer,  in  part,  the  ques¬ 
tion:  How  do  bats  catch?  Broadly  speaking,  the  evidence  indicated  that 
bats  know  accurately  where  their  targets  are,  and  that  their  techniques  of 


Figure  27  Interceptions  of  Moths  by  Wild  Red  Bats:  Interception  of  Freely  Flying 
Moth.  The  moth  may  have  begun  to  turn  away  from  the  bat  at  the  second  flash, 
but  no  sudden  evasion  is  made.  (Traced  from  sequences  made  in  conjunction  with 
A.  E.  Treat  and  D.  A.  Cahlander.) 


Figure  28  Interceptions  of  Moths  by  Wild  Red  Bats:  Interception  of  Moth  Tossed 
Up  from  Below.  Despite  some  unevenness  of  trajectory  due  to  the  moth’s  initiation 
of  flight,  the  bat  appears  to  make  a  smooth  and  excellent  prediction  of  the  moth’s 
path. 




Figure  29  Nonsomersault  Catch  by  Red  Bat.  This  catch  is  very  similar  to  the 
typical  tail  membrane  catch  of  Myotis  lucifugus.  The  last  image  shows  the  meal¬ 
worm  in  the  bat’s  mouth. 


catch  are  designed  for  a  balance  between  certainty  of  catch  and  speed  of 
execution. With smaller targets many bats seem to be able to aim so precisely
that their mouths go directly to the point of intercept. Accuracies



within  plus-or-minus  a  half-centimeter  appear  typical.  The  tail  membrane 
either  assures  that  the  target  will  not  escape  or  aids  in  the  actual  catch. 
It is also far safer for the bat to strike large targets with a flexible membrane
rather than with the mouth. For targets further from the direct
flight  path  of  the  bat,  the  wings  are  used  to  pull  or  shovel  the  target  into 


Figure  30  Wing  Touch  of  Tightly  Spiraling  Moth  by  Lasiurus  borealis.  Perhaps 
the  most  notable  feature  of  this  and  some  other  similar  sequences  is  the  absence  of 
radical following of the target typical of Myotis lucifugus. In this case the bat appeared
to judge the approximate point to which the spiral or loop would bring the
moth.  The  bat’s  right  wing  was  rather  seriously  injured  prior  to  this  attempt  and 
presumably  accounted  for  the  fact  that  the  scooping  action  of  the  wing  resulted  in 
a  touch  rather  than  a  catch.  Dotted  markings  indicate  wing  action  of  moth,  here 
at  about  40  beats  per  second. 


the  tail  pouch.  Sometimes  a  rapidly  moving  target,  or  one  that  is  detected 
at  the  last  instant  (or  perhaps  maneuvers  violently),  appears  not  to  be 
precisely  followed  during  the  last  fraction  of  a  second.  The  mechanism  of 
catch,  however,  tends  to  funnel  the  target  into  the  center  of  the  tail  pouch, 
and  although  such  indirect  techniques  of  catch  may  take  more  time,  they 
are  ordinarily  successful. 

The  rates  at  which  bats  can  catch  small  targets,  such  as  fruit  flies, 



have  been  discussed  elsewhere  (14).  As  many  as  two  complete  catches  and 
perhaps  as  many  as  three  attempts  within  one  second  seem  not  uncommon. 
Sustained  rates  of  about  15  catches  of  fruit  flies  per  minute  have  been 


Figure  31  Attempt  at  Maneuvering  Moth  by  Myotis  lucifugus. 
In  contradistinction  to  the  rather  nonspecific  head  following  of  the 
red  bat,  Myotis  lucifugus  typically  attempts  to  follow  a  target’s 
maneuvers  in  detail.  Possibly  this  servo-action  works  poorly  at 
high  angular  velocities  or  accelerations,  or  the  bat  is  unable  to  profit 
from  the  indications.  In  this  instance,  certainly,  the  action  of  the 
bat’s  body  lags  so  seriously  behind  the  moth’s  maneuver  that  the 
bat  relinquishes  pursuit. 


observed  in  the  laboratory.  If  such  a  rate  were  sustained  for  an  hour,  the 
number of insects caught would be about 900. Except in the case of evading
moths, the reliability of catch is sometimes extraordinary. Estimates
suggest  that  95  percent  of  fruit  flies  attempted  are  sometimes  caught.  In 
one  series  of  tests,  a  bat  made  over  100  successive  catches  of  mealworms 



tossed  into  its  flight  path  without  a  failure,  despite  the  fact  that  many 
tosses were far from perfect and that difficult selections from possible alternative
targets were often required. In summary, the bat’s typical catches
can be described as extremely rapid, astonishingly accurate, and sometimes
almost unbelievably reliable. The general rule of procedure seems




Figure  32  Successful  Catch  of  Turning  Moth  by  Myotis  lucifugus.  Downward 
turn  of  moth  appears  to  be  well  evaluated,  as  shown  by  accurate  reach  with  the 
wing. At the third image the moth has been transferred from wing tip to tail membrane.
At the fourth image the bat is resuming flight with the moth in its mouth.

to  be  that  the  bat  heads  accurately  for  the  target,  or  the  expected  point  of 
intercept, but has at its command techniques of catch capable of compensating
for many of the last-instant maneuvers that introduce radical
displacements  of  the  target  from  the  expected  point  of  catch. 
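
The rate and reliability figures quoted in this summary can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the inputs are simply the figures given in the text (15 catches per minute, and a 95 percent success rate on attempted fruit flies).

```python
def catches_per_hour(catches_per_minute: float) -> float:
    """Extrapolate a sustained per-minute catch rate to one hour."""
    return catches_per_minute * 60.0


def expected_catches(attempts: int, success_fraction: float) -> float:
    """Expected number of successful catches for a given number of attempts."""
    return attempts * success_fraction


if __name__ == "__main__":
    # 15 catches per minute, sustained for an hour
    print(catches_per_hour(15.0))
    # 95 percent reliability over 100 attempts
    print(expected_catches(100, 0.95))
```

The hourly figure is an extrapolation, as the text itself notes ("if such a rate were sustained"); the laboratory observations do not establish that a bat keeps up this rate for a full hour.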

What  Can  a  Bat  Judge  About 

its  Target  in  Advance  of  Capture? 

Initial  studies  have  suggested  that  the  identification  and  selection  of  targets 
is  not  only  a  very  complex  problem  in  itself  but  that  its  experimental 
evaluation  is  fraught  with  complications.  To  cite  an  example:  A  Myotis 
lucifugus was being tested for its capacity to discriminate between mealworms
and a rubber disc which, when seen end on, appeared about the
same  size  as  a  mealworm,  but  which  obviously  gave  much  greater  reflected 
echoes  when  at  right  angles.  During  an  early  phase  of  the  experiment  the 



average  score  taken  over  all  tosses  of  targets  was  such  that  no  significant 
discrimination  was  evident.  Closer  scrutiny,  however,  revealed  that  when 
the  bat  was  hungry  it  almost  invariably  selected  mealworms  and  when 
adequately  nourished  it  almost  always  chose  the  discs.  Although  the  bat 
was  clearly  capable  of  almost  perfect  discrimination,  a  casual  survey  of 
the  results  suggested  no  discriminating  ability  at  all.  In  many  other  of  the 
tests, the bats undoubtedly had an excellent capacity to discriminate between
the various targets encountered; yet since all were harmless, the bats
caught whatever appeared. Indeed, complete suppression of the inclination
to catch (by a hungry bat) requires negative reinforcement. Thus, when the
red bat was presented with various sizes of spheres, it caught almost everything
from a 2½ mm shot to a 70 mm tennis ball of 25,000 times the
volume.  Nevertheless,  certain  results,  including  some  later  tests  with  the 
same  bat,  suggested  very  striking  discrimination  capacities.  One  Myotis 
bat  learned  quickly  to  make  perfect  discrimination  between  mealworms 
and  all  sizes  of  spheres  when  each  target  was  presented  singly.  In  sample 
tests  this  bat  also  showed  ability  to  discriminate  between  mealworms  and 
cylinders  that  must  have  given  very  similar  echoes.  Only  when  an  object 
approximated  almost  exactly  the  size  and  shape  of  a  mealworm  was  no 
discrimination  evident.  Impressive  selections  of  mealworms  from  several 
simultaneous  alternative  targets  were  also  sometimes  achieved. 

To  examine  the  stimuli  available  to  the  bat  for  the  discrimination 
process,  experiments  making  use  of  an  artificially  generated  set  of  bat 
pulses  are  planned.  The  tests  will  be  designed  to  provide  suitable  measures 
of  the  acoustical  scattering  properties  of  the  various  targets  discriminated 
by  bats.  Besides  measuring  the  properties  of  the  returned  echoes,  recordings 
will  be  made  for  comparative  analyses  between  human  listening  and  the 
observed  discrimination  capacities  of  bats.  For  direct  human  listening,  of 
course,  the  tapes  must  be  greatly  slowed  down  during  playback.  It  is  also 
likely that binaural channels, external ear simulation, and special preprocessing
(e.g., beat generation) may have to be introduced in order to
transform  essential  stimulus  features  into  categories  producing  effective 
response  by  the  human  auditory  system. 

In  one  exploratory  test,  using  a  slow-down  factor  of  32,  the  echoes  from 
a  mealworm  and  a  sphere  of  like  size  appeared  indistinguishable,  when 
heard  monaurally,  to  human  listeners.  However,  both  the  quality  of  the 
recordings  and  the  listening  arrangements  were  too  crude  and  too  limited 
in  scope  for  more  than  a  preliminary  indication.  Offhand,  one  might  expect 
in  the  bat  far  higher  processing  speeds  and  special  auditory  adaptations  to 



the  required  analyses.  Experiments  serving  to  bring  out  the  echo  processing 
differences of bat and man might also give useful clues to effective echolocation
procedures for the guidance of blind persons. What we learn from
the  methods  of  bats  may  thus  tell  us  a  great  deal  about  what  is  possible 
with  echolocation  methods  and  may  even  suggest  procedures  whereby  such 
methods  could  be  implemented. 
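
To make the slow-down factor concrete, the following sketch shows what a tape slowed by a factor of 32 does to a bat pulse. It is a rough illustration, not a description of the recording apparatus: the function name is hypothetical, and the example pulse (a sweep from 100 to 50 kilocycles lasting about 3 milliseconds) is taken from the Myotis search pulse described later in this paper.

```python
def slowed_playback(freq_hz: float, duration_s: float, factor: float):
    """Return (frequency, duration) heard when a tape is slowed by `factor`.

    Slowing the tape divides every frequency by the factor and
    stretches every duration by the same factor.
    """
    return freq_hz / factor, duration_s * factor


if __name__ == "__main__":
    factor = 32.0  # slow-down factor used in the exploratory test
    start_hz, dur_s = slowed_playback(100_000.0, 0.003, factor)
    end_hz, _ = slowed_playback(50_000.0, 0.003, factor)
    print(f"sweep heard from {start_hz:.1f} Hz down to {end_hz:.1f} Hz")
    print(f"pulse stretched to {dur_s * 1000:.0f} ms")
```

Under these assumptions the whole sweep lands within the range of ordinary human hearing, and a pulse of a few milliseconds is stretched to roughly a tenth of a second.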

In  sum,  bats  are  capable  of  excellent  identification  of  simple  laboratory 
targets  (e.g.,  spheres,  cylinders,  discs,  and  mealworms)  within  the  fraction 
of  a  second  between  detection  and  capture.  Decision  to  avoid  an  inedible 
target  is  accompanied  by  an  immediate  cessation  of  the  rising  crescendo  of 
pulses  leading  to  the  catch.  Identification,  however,  seems  typically  better 
than  the  related  action  observed.  Bats  often  catch  any  airborne  object  of 
suitable  size  that  is  presented,  even  when  it  is  known  to  be  inedible;  at 
times  they  even  select  an  inedible  over  a  simultaneously  presented  edible 
target  which  they  can  clearly  distinguish. 

Virtually  nothing  is  known  about  the  details  of  natural  targets  that 
may  be  evaluated  by  a  bat.  Wing  action  is  perhaps  the  most  obvious  feature 
that  might  provide  crucial  information  both  as  to  the  nature  of  the  target 
and  its  expected  maneuvers.  Other  features  include  such  items  as  size, 
structure,  texture,  and  relative  orientation.  Analysis  of  many  such  features 
is  complicated  by  the  large  number  of  interrelated  variables. 

How  are  the  Bat’s  Echolocating  Signals  Related  to 

the  Interception  Problem  Being  Undertaken? 

Just  how  the  bat’s  echolocation  signals  are  related  to  the  evaluation  and 
interception  of  targets  appears  to  be  a  very  complex  matter  with  many 
obscure  facets.  The  general  nature  of  the  interception  signals  of  certain 
bats  has  been  well  described  by  Griffin  (7,  8).  Because  the  present  studies 
have  been  limited  to  interceptions  by  two  kinds  of  bats,  the  discussion  will 
deal  chiefly  with  these  two  bats:  Myotis  lucifugus  and  Lasiurus  borealis. 
Also, since details of the signals of these bats have been discussed elsewhere
(8, 14, 16, 17, 39), the possible significance of many features will
not  be  touched  on  here.  The  present  discussion  is  limited  to  a  rough  analysis 
of  how  the  signals  change  as  the  bat  intercepts  and  catches  its  prey. 

The  Interception  Signals  of  Myotis  lucifugus.  The  hunting  signals  of 
these  bats  are  conveniently  described  in  terms  of  three  phases  (14,  39): 
1)  search,  2)  approach,  and  3)  terminal  phase.  Probably  the  easiest  way 
to  picture  how  the  signals  shift  during  the  course  of  an  interception  is  in 



terms  of  the  two  extremes:  the  search  phase  and  the  terminal  phase.  During 
the  search  phase  the  pulses  are  relatively  long  and  occur  with  considerable 
spacing,  though  the  spacing  may  be  somewhat  irregular.  During  the  terminal 
phase, or “buzz,” the pulses are very short and occur in rapid succession
with very regular spacing. It seems likely that the echo from a single search
pulse can provide adequate data for the initiation of oriented pursuit behavior.
By contrast, the echoes from buzz pulses are probably analyzed
in  groups. 

In the laboratory the search pulse of a Myotis lucifugus bat has a duration
of 2 to 4 milliseconds (most commonly about 3) and recurs with a
spacing that tends to run from about 50 to 100 milliseconds, in other
words, at roughly 10 to 20 pulses per second. Each pulse starts at a frequency
of about 100 kilocycles and falls fairly uniformly over about an
octave, ending at roughly 50 kilocycles.*
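
The search-pulse figures above are related by simple arithmetic, sketched below. The function names are hypothetical, and the mean sweep rate treats the fall in frequency as uniform over the pulse, which is only an approximation (the footnote suggests the shift is closer to hyperbolic).

```python
def pulses_per_second(interval_ms: float) -> float:
    """Convert an interpulse interval in milliseconds to a repetition rate."""
    return 1000.0 / interval_ms


def mean_sweep_rate_kc_per_ms(f_start_kc: float, f_end_kc: float,
                              duration_ms: float) -> float:
    """Average downward frequency sweep rate over one pulse."""
    return (f_start_kc - f_end_kc) / duration_ms


if __name__ == "__main__":
    # Spacing of 50 to 100 ms between search pulses
    print(pulses_per_second(50.0), pulses_per_second(100.0))
    # 100 kc falling to 50 kc over a typical 3 ms search pulse
    print(round(mean_sweep_rate_kc_per_ms(100.0, 50.0, 3.0), 1))
```

The same sweep compressed into a shorter pulse gives a proportionally steeper sweep rate, which is the relation the text describes for the approach phase.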

Upon  detection  of  a  potential  target  or  a  near-by  obstacle,  the  interval 
between pulses tends to decrease and the pulses tend to shorten. If the object
is something to be pursued, the bat enters the second, or approach,
phase.  Pulse  repetition  rate  goes  up  progressively,  but  often  very  irregularly 
(sometimes  with  interesting  pulse  groupings),  merging  within  a  quarter  to 
a  tenth  of  a  second  into  the  terminal  buzz.  During  the  approach  phase, 
pulse  duration  also  shortens,  but  often  not  in  a  uniform  way.  The  frequency 
span  swept  by  the  pulse  at  first  tends  to  remain  approximately  an  octave 
(sweeping  from  100  kc  to  40  or  50  kc),  but  soon  begins  to  slide  down 
to lower frequencies. The rate of frequency sweep, of course, tends to increase
as the pulse length shortens. When slowed down for human listening,
the series of transition pulses often gives a momentary impression of uncertainty,
then picks up speed in a positive way and leads into the terminal
buzz. 

The  buzz  pulses  of  a  Myotis  lucifugus  typically  have  a  repetition  rate 
of  180  to  190  pulses  per  second,  the  total  buzz  consisting  of  anywhere  from 
about  a  dozen  to  perhaps  two  dozen  pulses,  from  one-third  to  one-half  a 
millisecond  in  duration.  The  frequency  sweep  tends  to  be  greatly  reduced, 
often  running  below  30  kilocycles  at  the  pulse  front,  and  sweeping  down 

* In the present illustrations, frequency is indicated by use of a zero-crossing
meter which moves up one centimeter for zero-crossing intervals running from
100,000 to 50,000 per second, another centimeter for frequencies running from 50
thousand to 33⅓ thousand, another centimeter for frequencies running from 33⅓
thousand  to  25,000,  and  so  on  (as  in  Figure  41).  The  line  representing  the  increase 
of  zero-crossing  interval  (decreasing  frequency)  is  essentially  straight  for  many  of  the 
pulses,  suggesting  that  the  shift  in  frequency  tends  to  be  more  or  less  hyperbolic. 
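
The scale described in this footnote is linear in the zero-crossing interval (the period) rather than in frequency: each centimeter corresponds to a further 10 microseconds of period. The sketch below illustrates that reading. Placing the zero of the scale at 100 kilocycles is an assumption made here for convenience, and the function names are hypothetical. A straight line on such a meter means the period grows linearly with time, so the frequency falls hyperbolically.

```python
def meter_height_cm(freq_hz: float) -> float:
    """Meter deflection, assuming 0 cm at 100 kc and 1 cm per 10 microseconds of period."""
    period_us = 1e6 / freq_hz
    return (period_us - 10.0) / 10.0


def hyperbolic_sweep_hz(t_ms: float, f_start_hz: float, f_end_hz: float,
                        duration_ms: float) -> float:
    """Frequency at time t when the period rises linearly from start to end."""
    p_start = 1.0 / f_start_hz
    p_end = 1.0 / f_end_hz
    period = p_start + (p_end - p_start) * (t_ms / duration_ms)
    return 1.0 / period


if __name__ == "__main__":
    for f in (100_000.0, 50_000.0, 100_000.0 / 3.0, 25_000.0):
        print(f"{f:9.0f} per second -> {meter_height_cm(f):.2f} cm")
    # Midpoint of a 3 ms pulse sweeping from 100 kc to 50 kc
    print(round(hyperbolic_sweep_hz(1.5, 100_000.0, 50_000.0, 3.0)))
```

With a hyperbolic sweep the frequency at mid-pulse lies below the arithmetic mean of the start and end frequencies, because the frequency falls faster at the start of the pulse than at the end.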



as  little  as  5  kilocycles.  Because  the  pulses  shorten  as  the  repetition  rate 
goes  up,  the  duty  cycle  may  not  be  significantly  altered,  tending  to  remain 
of  the  order  of  5  to  10  percent. 
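
The statement that the duty cycle stays of the order of 5 to 10 percent can be checked against the pulse figures given for the search phase and the buzz. The sketch below is a minimal illustration with a hypothetical function name; it assumes evenly spaced pulses of constant length within each phase.

```python
def duty_cycle_percent(pulse_ms: float, pulses_per_second: float) -> float:
    """Percentage of time spent emitting, for evenly spaced pulses of fixed length."""
    return pulse_ms * pulses_per_second / 1000.0 * 100.0


if __name__ == "__main__":
    # Search phase: about 3 ms pulses at 10 to 20 pulses per second
    print(duty_cycle_percent(3.0, 10.0), duty_cycle_percent(3.0, 20.0))
    # Buzz: one-third to one-half millisecond pulses at 180 to 190 per second
    print(duty_cycle_percent(1.0 / 3.0, 180.0), duty_cycle_percent(0.5, 190.0))
```

Although the repetition rate rises roughly tenfold between the two phases, the pulses shorten by a comparable factor, so the fraction of time spent emitting changes little.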

In analyzing the relation between the bat’s emitted signal and the interception
problem being solved, perhaps no single item is of greater interest
than the spacing between a given outgoing pulse and the primary echo.
Were the bat to attempt accurate range determination by direct measurement
of the time between the emitted pulse and the corresponding portion of
the echo, theoretical analysis (17, 22, 23, 24) has suggested that the results
could not provide the requisite accuracy, particularly with weak echoes. On
the other hand, if the emitted pulses occurred in well standardized units, and
a technique were used which introduced overlap between some representation
of the returning echo and of the outgoing pulse, greater precision
might be achieved. This contention is debated with significant neurological
evidence, however, by Grinnell (15). Preliminary measurements thus far
made with Myotis pulses indicate that direct overlap normally does not
occur. Instead, a gap of one or more milliseconds seems typically to be
left between the end of the emitted pulse and the beginning of the returning
echo of chief interest. As the bat gets closer to its target, the pulse
shortens so as to keep the returning echo out of the overlap zone. An
illustration of these relations has been given by Webster (38), and a further
illustration is presented in Figure 33.
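
The relation between target range, echo delay, and the longest pulse that avoids primary overlap can be sketched as follows. This is an illustration under stated assumptions, not the bat's actual rule: the round-trip figure of about 17 cm per millisecond is the one quoted in the caption of Figure 33, the one-millisecond gap is the lower end of the gap reported above, and the function names are hypothetical.

```python
ROUND_TRIP_CM_PER_MS = 17.0  # sound goes out and back about 17 cm per millisecond


def echo_delay_ms(range_cm: float) -> float:
    """Time from the start of emission to the start of the primary echo."""
    return range_cm / ROUND_TRIP_CM_PER_MS


def max_pulse_ms(range_cm: float, gap_ms: float = 1.0) -> float:
    """Longest pulse that still leaves `gap_ms` between pulse end and echo start."""
    return max(0.0, echo_delay_ms(range_cm) - gap_ms)


if __name__ == "__main__":
    for range_cm in (100.0, 50.0, 25.4):  # 25.4 cm is about 10 inches
        print(f"{range_cm:6.1f} cm: echo after {echo_delay_ms(range_cm):.2f} ms, "
              f"pulse no longer than {max_pulse_ms(range_cm):.2f} ms")
```

At about 10 inches, where the buzz of Figure 33 begins, this simple rule allows a pulse of roughly half a millisecond, which is of the same order as the buzz pulse durations reported above.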

The echo relations seen in adult Myotis bats, however, do not preclude
the use of direct overlap in the signals of other vespertilionid bats, or
during the course of development of echolocating techniques by young
Myotis bats. (The possible occurrence of “secondary overlap,” that is, overlap
between a pulse being emitted and the echo returned by the previous pulse,
is illustrated in Appendix Figure 52 during pursuit by the red bat.)
Moreover, as already mentioned, the Old World horseshoe bats emit pulses
of such a length and duty cycle that most of the echoes must come back
while the pulses are being emitted. As will be shown below, these bats
appear to have special techniques to help resolve the problem. Even with
Myotis bats pulse overlap may be used for exclusion measurements, that
is, to indicate the range of near-by objects which are to be avoided during
the course of pursuit. Also, the possibility that some bats may intermittently
emit low level “probe pulses” for intentional interaction between outgoing
probe and returning echo has never been excluded. At this juncture,
however, the manner in which a bat uses the indications from returning
echoes is so inadequately understood that we can say very little. How the



bat  derives  the  various  categories  of  information  that  it  needs  to  achieve 
the  rapid  and  successful  pursuits  so  commonly  seen  will  call  for  much 
more  extensive  study. 

The  Signals  of  Red  Bats.  Basically,  the  pursuit  signals  of  red  bats  (Figure 
34)  are  quite  similar  to  those  of  Myotis  lucifugus.  The  chief  differences 
can  perhaps  be  summarized  as  follows: 

1) The outdoor cruising pulse of the red bat is considerably longer
than that of Myotis, often reaching about 10 milliseconds in duration.


Figure  33  Relation  of  Emitted  Signal  to  Pursuit  Action  in  Myotis.  During  this 
relatively  straight  interception  the  bat  reduced  its  speed  very  little.  Of  particular 
interest  are  1)  the  position  of  the  terminal  “buzz”  relative  to  the  intercept  point, 
and  2)  the  relation  of  emitted  pulses  to  returning  echoes.  The  buzz,  which  here 
consists  of  only  9  or  10  pulses,  does  not  start  until  the  bat  is  about  10  inches  from  its 
target.  It  continues  almost  to  the  point  of  contact  (and  in  Figure  20  continued 
briefly  after  contact).  The  position  of  the  returning  echoes  relative  to  the  outgoing 
pulses is easy to calculate, since the sound goes out and back about 6¾ inches
(17  cm)  per  millisecond.  Either  an  electrical  or  an  acoustical  disturbance  appears 
on  the  sound  record  with  each  flash,  and  is  used  (with  correction  for  acoustical 
delay)  to  establish  synchrony.  Outgoing  pulses  are  shown  on  the  upper  side  of  the 
flight  line,  returning  echoes  below.  Pulse  length  is  shortened  to  keep  primary  echo 
well  out  of  overlap  zone  (i.e.,  primary  echo  is  out  of  zone  of  overlap  of  pulse 
being  emitted). 

2)  While  the  frequency  structure  of  the  cruising  pulse  (except  for  a 
flat  terminal  portion)  corresponds  roughly  to  that  of  the  Myotis  pulse,  the 
structure  changes  very  little  in  the  transition  from  cruise  to  buzz  (the  upper 
frequencies  dropping  only  from  about  85  or  90  to  perhaps  60  or  so). 
There  is,  of  course,  a  definite  steepening  of  the  sweep  rate  as  the  pulses 
shorten. 

3)  More  sudden  and  pronounced  variations  sometimes  occur  among 



such  features  as  pulse  amplitude,  pulse  shape,  pulse  duration,  and  interpulse 
interval. 

4)  Longer  buzzes  (exceeding  30  pulses  in  length)  are  not  uncommon. 

5) Secondary overlap (i.e., with echo from previous pulse) may sometimes
occur during pursuit.

Some of the variations in pulse patterns, during interceptions, are shown
in Figures 35 and 36. There is a possibility that the red bat uses the backward
somersault technique when it expects an interception of some difficulty
or complexity. There is evidence that the red bat may make somewhat
different use of its signal indications than does Myotis lucifugus, but
just what such differences actually are remains to be determined.


Figure 34 Relation of Emitted Signal to Pursuit Action in Lasiurus. This pursuit
was chosen for comparison because of its general similarity in form to Figure 33.
Besides the typically higher initial approach speed, features of interest are 1) the
longer and more distantly placed buzz, and 2) the resultant positioning of the echoes
closer to the center of the interpulse interval. The buzz starts just over two feet
away, and ends about five inches from the point of contact. Though in most cases
the returning echoes are closer to the corresponding emitted pulse, they sometimes
occur beyond the center of the interval and may lead to secondary overlap (i.e.,
with succeeding pulses). The apparent terminal speed-up of the pulses is actually
due to the slowing of the bat. Further comparisons are given in Figure 35.

The Signals of Rhinolophus ferrum-equinum. While adequate studies of
Rhinolophus bats during pursuit have not been completed, some synchronized
recordings and high speed films have been made of Rhinolophus
bats  landing  (9,  25)  and  catching  mealworms.  The  landing  films  showed 
that  during  the  final  part  of  the  approach  (in  one  instance  at  least)  pulse 


[Plot for Figure 35; panel label: LASIURUS]


Figure 35 Pulse Patterns During Interceptions. Pulse repetition patterns for the catches illustrated in Figures
33 (Myotis) and 34 (Lasiurus). The ordinate is interval between pulses, in milliseconds, and the abscissa is time



repetition  rate  increased  to  about  75  per  second.  During  pursuit,  rates  up 
to  88  per  second  were  noted.  However,  in  contradistinction  to  the  FM 
bats  discussed  above,  the  duty  cycle  remains  so  high  (approximately  90 
percent)  that  most  echoes  must  continue  to  be  received  during  the  emission 
of  pulses.  Adequate  time  reference  between  pulse  and  echo  probably  could 
hardly  be  established  in  a  positive  way  without  measurements  other 
than  simple  echo  time.  The  most  notable  additional  feature  of  the  signal 
reception  system  of  this  bat  is  the  occurrence  of  rapid  oscillations  of  the 
ears.  Various  kinds  of  oscillatory  action  have  been  described  (33),  the 
one  of  most  immediate  interest  being  the  accelerating  alternation  of  left 
and  right  ears  seen  during  the  bat’s  approach  to  a  landing,  during  most 
pursuits,  and  also  seen  when  an  object  is  brought  up  close  to  a  stationary 
bat.  Examples  of  this  oscillatory  motion  are  shown  in  Figures  37  and  38. 
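
The consequence of a 90 percent duty cycle can be put in rough numbers. The sketch below estimates the pulse length implied by the repetition rates quoted above, and the range within which the front of an echo must arrive while the pulse is still being emitted. It assumes evenly spaced pulses, and it borrows the round-trip figure of about 17 cm per millisecond from the caption of Figure 33. The function names are hypothetical, and no pulse durations for Rhinolophus are stated in this passage, so the results are inferences rather than measurements.

```python
ROUND_TRIP_CM_PER_MS = 17.0  # from the caption of Figure 33


def pulse_ms_from_duty(duty_fraction: float, pulses_per_second: float) -> float:
    """Pulse length implied by a duty cycle and repetition rate (evenly spaced pulses)."""
    return 1000.0 * duty_fraction / pulses_per_second


def overlap_range_cm(pulse_ms: float) -> float:
    """Targets nearer than this return the start of their echo during emission."""
    return pulse_ms * ROUND_TRIP_CM_PER_MS


if __name__ == "__main__":
    for rate in (75.0, 88.0):  # landing approach and pursuit rates from the text
        p = pulse_ms_from_duty(0.9, rate)
        print(f"{rate:.0f} per second: pulse about {p:.1f} ms, "
              f"overlap for targets within about {overlap_range_cm(p):.0f} cm")
```

On these assumptions any target within roughly two meters would produce overlap, which is consistent with the remark that most echoes must be received during the emission of pulses.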

Of  special  interest  is  the  fact  that,  during  part  of  this  rapid  oscillatory 
phase  at  least,  there  is  an  almost  exact  one-to-one  correspondence  between 
each  pulse  (which  is  emitted  through  the  nose)  and  a  specific  action  of 
the  ears,  though  the  phase  relations  between  the  two  may  shift.  Apparent 
onset  of  alternation  during  decision  between  two  targets  is  shown  in  Figure 
38.  Various  implications  of  this  mechanism  have  been  discussed  elsewhere, 
and  need  not  be  described  here.  One  relevant  point  to  note,  however,  is 
that  either  the  neurological  “commands”  to  the  ears,  or  the  proprioceptive 
indications  of  such  action,  may  provide  the  requisite  time  reference  on 
which  the  bat  anchors  measurements  of  range.  This  appears  to  be  but  one 
more  example  of  the  many  ingenious  mechanisms  that  evolution  has  pro¬ 


to  contact  with  target,  also  in  milliseconds.  One  convenient  way  to  visualize  the 
indications  of  these  plots  is  to  imagine  a  diagonal  sweep  (of  uniform  rate)  starting 
at  the  base  line  at  the  beginning  of  each  pulse,  a  given  sweep  terminating  when  the 
next  pulse  occurs.  Both  ordinate  and  abscissa  spacings  are  thus  linearly  related  to 
interpulse  interval.  Possibly  the  most  conspicuous  feature  of  difference  between  the 
Myotis  sequence  and  the  Lasiurus  sequence  is  the  fact  that  the  long  terminal  buzz  of 
Lasiurus  has  almost  ended  before  the  terminal  buzz  of  Myotis  has  started.  Though 
such  a  large  discrepancy  may  not  be  entirely  representative,  and  extensive  individual 
variations  also  occur,  it  is  to  some  extent  typical  of  one  major  difference  in  the 
approach  signals  of  the  two  kinds  of  bats.  Another  obvious  difference  is  in  the 
repetition  rate  of  the  terminal  pulse  sequence  (buzz).  For  the  Myotis  record  the 
rate  is  about  180  pulses  per  second,  and  for  the  Lasiurus  record,  about  218  pulses 
per  second.  Other  records  of  these  bats  (some  made  in  the  wild)  have  shown  like 
differences.  The  long  pause  after  the  catch  occurs  because  the  bat’s  head  is  down  in 
the  pouch  for  seizure  of  its  prey.  An  unsuccessful  attempt  or  reorientation  prior  to 
seizure  with  the  mouth  results  in  a  shorter  pause.  (Other  records  of  the  signals 
associated  with  interceptions  and  attempts  are  given  in  References  7,  8,  14,  15,  31, 
and  38.) 




TIME  TO  ASSUMED  POINT  OF  CATCH  (MSEC) 


Figure  36  Pulse  Repetition  Patterns  During  Catches  by  Lasiurus  borealis.  The 
two  sequences  shown  here  were  chosen  because  they  approximated  two  extremes 
of the interception signals for catches by Lasiurus in the laboratory. The first illustrates
a definite and relatively uniform transition into the terminal sequence, while
the  second  shows  a  transition  with  alternation  of  longer  and  shorter  spacings  and 
some  irregularity  at  the  start  of  the  long  buzz.  Current  evidence  is  inadequate  to 
specify  the  significance  of  the  alternating  or  irregular  groupings  and  long  buzzes 
that  sometimes  occur.  One  guess  is  that  they  may  be  related  to  selection  or  location 
of  a  target  in  a  complex  situation  (complex  near-by  configurations,  multiple  targets, 
irregular  target  motion,  etc.).  Pulse  gaps  in  the  sequence,  as  seen  in  the  signals  of 
Figure  12,  are  also  common.  Above  the  pulse  interval  plot  for  Number  2  are  plots 
giving  a  rough  indication  of  pulse  duration  and  frequency  range.  The  chief  feature 
of  note  relative  to  corresponding  records  of  Myotis,  is  the  continuing  rather  high 
initial  frequency  and  large  frequency  sweep  throughout  the  buzz.  Differences  between 
sequences  in  the  laboratory  and  in  the  wild  may  be  seen  by  comparing  a  typical 
outdoor  sequence  (Figures  52,  53,  and  54)  with  the  present  records. 




Figure  37  Ear  Alternation  During  Inspection  of  Close-up  Object.  This  illustration 
is  the  start  of  alternation  in  film  sequence  105-5  of  Reference  9.  At  frame  Number  1 
the  right  ear  is  just  starting  back  after  the  head  has  been  moved  forward  in  a 
scanning  action  with  both  ears  symmetrical.  The  first  pulse  of  the  present  group 
of  eight  ends  as  the  right  ear  reaches  its  full  back  position  (see  frame  Number  3) 
and  another  pulse  starts,  about  four  milliseconds  later,  roughly  as  the  right  ear 
moves  forward  again.  This  pulse  ends  (about  25  milliseconds  later)  when  the  ear 
positions  are  reversed  (see  frame  Number  6).  There  is  a  sudden  increase  in  rate 
after  the  first  three  pulses,  but  ear  action  and  pulse  timing  remain  well  synchronized 
during  the  five  yet  more  rapid  pulses  that  follow.  At  this  point  the  synchrony 
breaks  up  and  a  new  phase  begins.  (The  interval  between  frames  is  7.8  milliseconds.) 



duced  to  provide  bats  with  the  extraordinarily  rapid,  precise,  and  reliable 
echolocation  techniques  which  current  studies  have  amply  demonstrated. 

Summary.  Existing  knowledge  of  the  relations  between  the  echolocating 
signals  and  the  interception  procedures  governed  is  not  adequate  for  proper 
summary.  A  few  general  comments  can,  however,  be  made. 

1)  Definite  relations  exist  between  aspects  of  the  bat’s  emitted  signals 
and  certain  definable  features  of  the  interception  situation.  For  example, 
pulse  repetition  rate  and  pulse  duration  are  related  to  distance  to  target 


Figure  38  Ear  Action  During  Decision  Between  the  Pursuit  of  Two  Targets.  Ear 
action in these bats appears to vary greatly during the final phases of pursuit. Sometimes,
ear alternation continues to the point of catch; at other times the alternation
diminishes  in  excursion  until  both  ears  are  directed  forward  symmetrically  from  a 
distance  as  great  as  a  foot  away  from  the  target.  Catch  techniques  are  highly 
diversified  and  commonly  involve  extensive  use  of  the  wings.  In  one  of  the  current 
sequences,  for  example,  the  target  was  passed  from  the  left  wing  to  the  right  wing 



and  phase  of  interception.  (Since  interception  velocities  vary  by  a  factor 
of  at  least  10,  distance  and  phase  are  not  equivalent.)  In  the  temperate 
New  World  bats  pulse  length  appears  to  be  shortened  so  as  to  avoid  primary 
overlap  between  emitted  pulse  and  returning  echo.  Repetition  rate  may 
increase  by  a  factor  of  100  between  the  phase  representing  search  for  a 
distant  target  and  the  terminal  phase  immediately  preceding  a  catch  (for 
example,  within  one  foot).  Even  in  the  Old  World  horseshoe  bats,  which 
use  pulse  overlap,  terminal  rates  may  reach  almost  half  the  maximum 
rates  seen  in  typical  New  World  bats,  the  latter  typically  approximating  200 
pulses  per  second. 

2) There is some evidence that features of the emitted signal are related
to difficulty, or expected difficulty. Irregularities in repetition pattern
and  perhaps  in  pulse  structure  often  appear  to  increase  in  such  situations. 
Prolongation  of  the  approach  phase  or  terminal  phase  may  also  occur. 

3)  The  decision  to  relinquish  an  attempt  at  interception,  or  to  defer 
final  approach  pending  a  preparatory  maneuver,  is  accompanied  by  an 
immediate  drop  in  repetition  rate. 

4) The rapidity with which the emitted signal can be altered in accordance
with sudden changes in the interception situation is not established.
Shifts with unexpected events do, however, sometimes seem to take
place  in  one-twentieth  of  a  second  or  less. 

5) Certain receiving mechanisms (the aim of the outer ear, for example)
are strongly related to signal emission (or to expected echo) in
some bats. In others little external action is evident. There may, however,
be many features of the ear mechanism which present observations overlook.
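
The overlap constraint mentioned in item 1) above can be made concrete with a little arithmetic. The following is only a rough sketch: it assumes a speed of sound of about 1125 feet per second (a value not given in the text), treats bat and target as stationary during one pulse, and uses hypothetical function names. It computes the round-trip echo delay for a given range, which is the longest pulse that can be emitted without primary overlap between pulse and echo.

```python
# Assumed speed of sound in air (about 20 deg C), in feet per second.
# This constant is an illustrative assumption; it is not given in the text.
SPEED_OF_SOUND_FT_PER_S = 1125.0


def echo_delay_ms(distance_ft, speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Round-trip travel time, in milliseconds, to a target and back."""
    return 1000.0 * 2.0 * distance_ft / speed_ft_per_s


def pulse_overlaps_echo(pulse_ms, distance_ft):
    """True if the pulse is still being emitted when its echo starts to return."""
    return pulse_ms > echo_delay_ms(distance_ft)


if __name__ == "__main__":
    # Print the overlap-free pulse limit for a few sample ranges.
    for distance_ft in (10.0, 3.0, 1.0, 0.5):
        print(f"{distance_ft:5.1f} ft: pulse must be shorter than "
              f"{echo_delay_ms(distance_ft):.2f} ms to avoid overlap")
```

Under these assumptions the permissible pulse length shrinks in direct proportion to range, which is at least consistent with the shortening of pulses reported for the terminal phase.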


CONCLUSIONS 

In  concluding,  an  attempt  must  be  made  to  answer  the  question,  “What  do 
interceptions  by  bats  have  to  do  with  the  guidance  of  blind  human  beings?” 
In broadest terms, a suitable general answer might be this: The echolocating
procedures of the bat during interceptions bring out the bat’s techniques


and  back  to  the  left  wing  again  for  seizure  with  the  mouth.  In  contradistinction  to 
the  evidence  of  previous  investigators  (40),  current  high  speed  films  indicate  that 
the  tail  (interfemoral)  membrane  is  sometimes  used,  either  directly  for  the  catch 
or  in  conjunction  with  a  wing,  prior  to  seizure  by  the  mouth.  The  present  illustra¬ 
tion  is  a  two-flash  sequence  showing  what  appears  to  be  the  transition  from  the  ear- 
forward  approach  (commonly  seen  at  close  range  to  a  target)  to  the  alternating  ear 
inspection  of  a  more  distant  alternative  target,  which  was  caught. 



at  peak  performance.  The  study  of  the  bat’s  peak  performance  may  reveal 
details,  methods,  and  principles  normally  masked  in  the  less  exacting  tasks 
which  might  seem  more  logically  related.  Certain  aspects  of  the  problems 
interpose  obvious  difficulties  when  attempts  are  made  to  convert  the 
findings  in  bats  to  the  development  of  methods  for  human  beings.  For 
example: 

1) Attempts at direct application of the bat’s signals to human use
tend  to  be  disappointing,  even  when  suitable  frequency  conversions  are 
made.  The  bat’s  system  was  designed  for  different  problems  and  probably 
with  quite  a  different  processing  emphasis.  At  the  same  time,  a  knowledge 
of  how  the  bat  processes  its  echolocation  data  might  well  provide  extremely 
valuable  clues  for  methods  adaptable  to  human  beings. 

2)  The  acoustical  guidance  of  bats  is  dominated  very  heavily  by  the 
requirement of speed. The focal point of a bat’s system, perhaps, is its capacity
to guide pursuits of small and rapidly moving targets. In studying
the  bat's  system,  as  used  in  other  applications,  the  dominant  role  of  this 
particular  function  in  the  evolution  of  its  methods  must  constantly  be  kept 
in mind. Possibly the bat’s system can give many clues to acoustical guidance
where speed is important. With clues taken from bats, for example, highly
effective methods might be developed for blind guidance during
such sports as tennis, volley ball, trampolining, and skiing (38).

There  is  a  final  point  which  is  especially  worthy  of  note.  The  bat’s 
mechanisms  are  ultra-miniaturized.  The  bat’s  extraordinary  achievements 
are  done  with  analytical  equipment  that  may  weigh  only  one-tenth  of  a 
gram.  The  bat,  moreover,  does  not  have  the  advantage  of  microsecond 
speed  elements.  Its  elements  take  a  millisecond  or  more  to  act.  The  bat’s 
system,  therefore,  cannot  squander  its  components  or  its  time.  Probably 
it  must  often  make  near-optimum  use  of  many  facets  of  the  information 
it  receives.  At  last  we  are  reaching  an  effective  position  for  probing  into 
the  nature  of  these  mechanisms.  What  we  find  there  seems  destined  to 
open  many  doors. 

APPENDIX: ULTRASONIC INTERRELATIONS OF

BATS AND MOTHS: A DIAGRAMMATIC SUMMARY

INTRODUCTION 

Events  that  shape  the  pursuit-escape  relations  of  bats  and  moths  have  such  a 
long  evolutionary  history  that  many  complex  details  of  each  creature  may  have 
been  shaped  by  the  survival  value  of  trends  in  the  conflict.  Already  enough  is 
known  to  warrant  an  attempt  at  systematic  representation.  Although  any  such 
schematized  representation  oversimplifies,  it  may  also  help  in  visualizing  the 



kinds  of  forces  that  shape  the  general  picture.  Certainly  the  bat-moth  problem 
is an unusually intriguing example of evolutionary interaction, containing,
perhaps, many features beyond those we can now see, or even guess.

Fortunately a number of the basic elements have been painstakingly investigated
and clearly presented by Roeder and Treat, and by other investigators
(2,  27  to  32,  34  to  37).  Certain  of  the  broader  features  of  the  so-called  “bat- 
moth  battle”  have  also  been  described.  The  intention  here  is  to  tie  together  the 
general  form  of  existing  findings  rather  than  to  report  in  detail  the  quantitative 
results  of  individual  experiments. 

GENERAL  FORM  OF  BAT-MOTH  INTERRELATIONS 

Figure  39*  illustrates  in  a  rough  way  a  possible  physical  situation  where  a  bat 
is  closely  approaching  a  flying  moth.  Here  the  bat  is  flying  more  or  less  toward 
the  moth,  while  the  moth  is  moving  roughly  at  right  angles  to  the  path  of  the 
bat.  Figure  40  gives  a  corresponding  vector  representation.  Starting  to  echo 
back  from  the  moth  is  a  typical  terminal  pulse  of  a  Myotis  lucifugus.  The  pulse, 
shown  approximately  to  scale,  is  about  1/3  millisecond  in  duration  and  sweeps 
down  roughly  8  kc  from  an  initial  frequency  of  30  kc.  Figure  41  gives  samples 
of  moth  pulses  recorded  outdoors  and  illustrates  typical  changes  in  the  pulses 
of  bats  during  the  initiation  of  pursuit.  Photographs  of  situations  corresponding 
to  Figure  39  have  shown  that  at  this  juncture  the  bat  might:  1)  catch  the  moth, 
probably  here  with  the  use  of  the  right  wing,  2)  make  an  unsuccessful  attempt 
at capture (though commonly with an ensuing successful attempt), or 3) deliberately
avoid the moth.** In this latter case, it is unlikely that the buzz-type
pulse illustrated would be emitted, since the decision to relinquish pursuit appears
to be followed by immediate cessation of such pulses. Examples of actual encounters
are given in Figures 42, 43, and 44.
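
The figures quoted above for the terminal pulse of Myotis lucifugus (about 1/3 millisecond, sweeping down roughly 8 kc from 30 kc) lend themselves to a short worked calculation. The sketch below assumes, purely for illustration, that the sweep is linear in time and that sound travels at about 1125 feet per second; neither assumption comes from the text, and the function names are hypothetical.

```python
# Pulse parameters as quoted in the text.
PULSE_DURATION_MS = 1.0 / 3.0   # about one-third of a millisecond
START_KC = 30.0                 # initial frequency, kc
SWEEP_KC = 8.0                  # total downward sweep, kc

# Assumed speed of sound in air, feet per second (not given in the text).
SPEED_OF_SOUND_FT_PER_S = 1125.0


def instantaneous_frequency_kc(t_ms):
    """Frequency at time t_ms into the pulse, ASSUMING a linear downward sweep."""
    if not 0.0 <= t_ms <= PULSE_DURATION_MS:
        raise ValueError("time lies outside the pulse")
    return START_KC - SWEEP_KC * (t_ms / PULSE_DURATION_MS)


def sweep_rate_kc_per_ms():
    """Average rate of frequency change over the pulse."""
    return SWEEP_KC / PULSE_DURATION_MS


def pulse_length_in_air_ft(speed_ft_per_s=SPEED_OF_SOUND_FT_PER_S):
    """Physical length of the pulse in air, from leading to trailing edge."""
    return speed_ft_per_s * PULSE_DURATION_MS / 1000.0


if __name__ == "__main__":
    print(f"end frequency: {instantaneous_frequency_kc(PULSE_DURATION_MS):.1f} kc")
    print(f"sweep rate:    {sweep_rate_kc_per_ms():.1f} kc per ms")
    print(f"pulse length:  {pulse_length_in_air_ft():.3f} ft")
```

On these assumptions the pulse ends near 22 kc and occupies only a few inches of air, which gives some feeling for the scale at which a drawing "approximately to scale" must be read.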

Figure  45  shows  very  schematically  a  few  potentially  significant  message 
paths  in  bat-moth  encounters.  The  blocks  specifying  functional  categories  are 
intended to give only the crudest idea of the vast array of messages and operations
that actually occur, and no resemblance to actual anatomical arrangements
is intended. The chief feature to note is the complexity of message
categories  even  for  the  simple  situation  pictured.  The  bat  must  clearly  be 
capable  of  quick  and  effective  exclusion  of  detail  irrelevant  to  the  immediate 
pursuit,  yet  at  the  same  time  be  capable  of  meaningful  evaluation  of  the  many 
constraining obstacles and configurations that dictate its flight path.

AUDITORY  SYSTEM  OF  BAT: 

SURVEY  OF  POTENTIALLY  RELEVANT  FEATURES 

The  general  arrangement  of  a  bat’s  auditory  system  is  shown  schematically  in 
Figure  46.  Since  the  system  is  highly  intricate  and  is  discussed  extensively  by 


*  Figures  for  the  Appendix  begin  on  page  115. 

**  There  are  also  transitions  or  combinations  as,  for  example,  a  catch  or  hit  with 
apparently  deliberate  failure  to  retain.  Some  of  these  may  be  test  touches  or  catches 
serving  to  confirm  or  correct  an  acoustical  evaluation. 



Grinnell  (15)  only  a  few  brief  comments  need  be  made  here.  Certain  features 
are  of  special  interest,  particularly  when  viewed  in  contrast  to  the  vastly  simpler 
system  of  the  moth.  The  following  are  examples. 

1) Extensive frequency and intensity coding occurs peripherally (for example,
by way of the hydrodynamic and transduction systems that activate the
first order acoustical fibers).

2)  Some  rapid  time  sequence  coding  may  also  occur  peripherally  by  way  of 
the  outer  ear  configuration,  inner  ear  hydrodynamics,  and  transducer  coding 
mechanisms  at  the  point  of  neural  excitation.* 

3)  Since  efferent  messages  may  travel  out  the  auditory  nerve  as  far  as  the 
transducing  mechanisms,  some  selective  gating  or  shaping  of  incoming  messages 
(under central control) may occur as far peripherally as the inner ear. Presensitization
also occurs in the auditory centers at certain intervals after pulse
emission,  suggesting  that  increased  sensitivity  may  act  as  a  gating  mechanism 
for  the  reception  of  echoes  at  specified  times. 

4)  Interaction  between  messages  from  the  two  ears  occurs  at  several  levels 
and  probably  by  way  of  a  number  of  different  neural  mechanisms.  Earliest 
binaural interaction may occur at the large cochlear nuclei, where the first meshwork
of auditory synapses is located; but the most extensive low level interaction
presumably takes place in the olivary complexes at the two sides. Some
interaction  also  occurs  in  the  reticular  formation  at  various  levels,  but  the  chief 
higher  level  locus  of  interaction  appears  to  be  the  posterior  colliculus  (next 
main  station  of  the  ascending  auditory  pathway).  Grinnell’s  experiments  (15) 
indicate  that,  among  other  actions,  what  happens  at  one  ear  may  rapidly  alter 
sensitivity  in  the  other  auditory  channel. 

5)  Certain  centers,  such  as  the  exceptionally  prominent  nucleus  of  the 
lateral  lemniscus  (also,  perhaps,  parts  of  the  reticular  formation),  may  have 
important  functions  in  the  selective  control  of  left-right  interactions. 

6)  The  posterior  colliculus  is  extremely  large  and  complex  and  may  be  the 
main  integrating  station  for  complex  incoming  messages. 

7)  Higher  stations — notably  the  medial  geniculate  and  auditory  cortex — 
are  small  and  undeveloped  relative  to  the  hyperdevelopment  of  the  cochlear 
and  accessory  nuclei  and  the  posterior  colliculus.  This  is  presumably  in  line 


* Relevant here are the experiments of Batteau and collaborators (1) on human
auditory  localization  with  the  use  of  high  frequency  components,  partially  discussed 
by  Mills  (19).  Basic  to  the  relatively  direct  localizing  indications  observed  with 
stimulus  configurations  incorporating  sound  components  above  perhaps  7500  cps 
are:  a)  a  complex  sound  of  adequate  bandwidth  and  b)  a  multipath  channel  to  the 
coding  mechanisms  of  the  inner  ear.  The  time  delays  between  channels  (involving 
mostly  time  separations  of  less  than  1/10  millisecond)  are  presumably  coded  into 
a spatial dimension along the basilar membrane and, with brief intervals of correlation,
mapped uniquely into external spatial categories. Azimuth, elevation, and
approximate  range  of  a  discrete  source  can  sometimes  be  established  with  remarkable 
rapidity  and  accuracy  without  the  use  of  sequential  comparisons  or  any  motion  by 
the  head  or  ear.  However,  precision  and  certainty  of  evaluation  are  enhanced  by  the 
separation  of  location  and  quality  made  possible  with  more  extended  correlations 
and  with  the  use  of  two  ears. 



with  the  bat’s  need  for  extremely  rapid  analysis  and  evaluation  of  received 
signals. 

8)  The  various  centers  are  close  together,  favoring  rapid  conduction. 

9) Much neurological structure is very compact, suggesting preset mechanisms
of relatively simple routing and restricted processing operations.

10)  Various  mechanisms,  including  those  of  the  ear  dynamics,  seem  suited 
for  ultrasonic  frequency  discrimination. 

11) Quick reflex muscular action, probably both of the middle ear and of
the  outer  ear,  may  act  to  modify  over-all  sensitivity — and  possibly  directional 
sensitivity — within  10  or  20  milliseconds  of  stimulus  onset  or  radical  change  of 
quality. 

12)  Various  fairly  direct  motor  pathways  may  lead  out  at  several  different 
levels  along  the  auditory  pathways,  suggesting  that  in  comparison  with  many 
mammals there is greater emphasis on the initiation of early responses to received
signals.

A  bat’s  auditory  system  thus  seems  specialized  for:  a)  rapid  and  selective 
reception  of  auditory  signals,  b)  appreciation  and  interpretation  of  closely- 
spaced  temporal  structure  and  quick  evaluation  of  shifts  in  the  stimulus  pattern, 
c)  quick  and  effective  detection  of  left-right  stimulus  differences,  d)  automatic 
gating  for  expected  relevant  stimulus  values,  e)  excellent  ultrasonic  frequency 
discrimination,  f)  fast  initial  motor  responses  to  external  changes  as  reflected 
via  shifts  in  echo-pattern.  Such  features  clearly  fit  it  well  for  the  job  of  rapid 
evaluation  of  echo-complexes  and  for  a  rapid  conversion  of  the  indication  into 
appropriate  pursuit  action. 

TYMPANIC  SYSTEM  OF  MOTH 

The  moth’s  auditory,  or  tympanic,  system  is  vastly  less  complex  than  that  of  a 
bat.  Because  of  its  relative  simplicity,  and  because  the  moth’s  acoustical 
evaluations  introduce  certain  basic  determinants  into  many  of  a  bat’s  more 
crucial  pursuits,  a  somewhat  more  specific  review  of  what  is  known  of  the 
moth’s  system  may  be  in  order. 

Experiments  by  Roeder  and  Treat  (27  to  32  and  34  to  37*)  have  provided 
a useful orientation to the level of analytical complexity and the range of response
times that may reasonably be expected of a moth. Their observations,
for  example,  suggest  the  following. 

a)  Only  the  crudest  frequency  discrimination,  if  any  at  all,  occurs  (which 
is  also  consistent  with  the  absence  of  any  linear  array  of  frequency-ordering 
elements  such  as  typically  provides  the  basis  of  frequency  coding  in  higher 
animals). 

b)  Binaural  directional  sensitivity  is  present  for  lower  intensities  of  a  bat’s 
signals,  and  may  act  in  part  to  turn  the  moth  away  from  the  bat,  thus  reducing 
reflectivity  to  the  bat’s  signals. 

*  A  recent  paper  by  Roeder  (in  Animal  Behaviour,  Vol.  10,  Nos.  3/4  (1962), 
pp.  300-304)  presents  further  details  of  the  responses  of  free-flying  moths  to  bat- 
type  signals. 



c)  Response  times  are  slow  compared  to  those  of  a  bat:  mostly  1/5  to  1/2 
second  (and  often  longer)  for  the  moth  as  against  about  1/10  to  perhaps  1/30 
of  a  second  for  a  bat. 

d)  The  simplicity  of  the  moth’s  system  limits  any  analysis  of  the  bat's 
signals  to  rather  few,  rough  categories  (mostly  relating,  in  all  probability,  to 
the  direction  and  proximity  of  the  bat).  Irregular  or  random  features  in  the 
properties  of  the  moth's  associated  motor  responses  (perhaps  particularly  with 
respect  to  timing)  may,  however,  greatly  complicate  the  bat's  prediction 
problem. 

Figure  47  is  a  very  schematic  view  of  the  tympanic  system  of  a  typical 
sound-sensitive  moth  (as,  for  example,  a  noctuid).  The  sequence  of  events, 
upon  reception  of  an  acoustical  signal,  can  be  outlined  roughly  as  follows. 
Vibration  of  the  tympanic  membrane  causes  the  transducing  elements 
(scolopes) to activate the dendrites (receptor fibrils) of the sensory cells. Relevant
aspects of the acoustical signal (as, for example, intensity, duration, and
perhaps  certain  groupings)  are  coded  into  the  firing  pattern  of  the  nerve  spikes 
originating  in  these  two  cells.  The  axons  of  the  cells  enter  the  tympanic  nerve 
which  transmits  nerve  impulses  to  the  pterothoracic  ganglion  in  the  mesothorax 
of  the  moth.  Very  little  is  known  about  the  neural  operations  that  take  place  in 
this  ganglion.  It  is  clear,  however,  that  selective  and  organized  motor  responses 
occur  as  a  result  of  acoustical  stimulation.  The  implication  is  that  appropriate 
categories of organized motor response can be activated or inhibited in accordance
with indications derived by way of the tympanic organs at the two
sides.  Experience  and  learning  presumably  play  little  if  any  role  in  shaping  the 
behavior  of  the  moth. 

Since  the  tympanic  fibers  are  accessible  for  electrical  recording,  exploration 
of  the  incoming  nerve  impulses  in  response  to  acoustical  stimulation  has  been 
possible.  Studies  of  the  responses  of  intact  moths  have  also  been  made.  The 
simplest  way  to  review  the  main  observations  is  in  four  stages,  as  follows. 

A. Input-output relations for a single tympanic fiber, using as inputs acoustical
stimuli of standard form that approximate natural ones;

B.  Relations  between  responses  in  the  two  acoustical  fibers  at  one  side; 

C.  Relations  between  the  responses  in  the  pairs  of  acoustical  fibers  at  the 
two  sides;  and 

D.  Samples  of  responses  of  intact  moths  to  acoustical  stimuli. 

Fortunately the known relations are relatively simple and can readily be summarized
in graphic form.

SOME  INPUT-OUTPUT  RELATIONS  FOR  A 
SINGLE  TYMPANIC  FIBER 

Figures 48, 49, and 50 illustrate in schematic form a possible recording arrangement
and give samples of stimulus-response situations. If we assume a
standard  ultrasonic  pulse  of  fixed  duration  (say  2  milliseconds)  but  of  variable 
level  or  intensity,  then  the  main  response  shifts  that  occur  with  changes  of 
signal  level  may  be  defined  in  terms  of:  1)  latency  in  the  appearance  of 
spikes,  2)  duration  of  the  spike  group,  3)  total  number  of  spikes,  and  4)  initial 



rate  of  spike  discharge.  In  the  end,  of  course,  other  stimulus  parameters  than 
level  or  intensity  must  be  evaluated.  Stimulus  configurations  that  are  obviously 
related  to  important  natural  situations  might,  for  example,  include:  high  signal 
rates,  multiple  echoes,  and  complex  echo  patterns  of  the  type  reflected  from 
trees. It is thus necessary to consider such response variables as: accommodation
effects, multiple response levels and groupings, response irregularities, etc.
Level  of  excitation  for  a  given  stimulus  unit  also  varies  according  to  surrounding 
conditions,  and  decays  more  or  less  exponentially  in  the  absence  of  significant 
stimuli.  The  most  notable  feature  of  the  response  of  a  given  fiber,  in  relation 
to  stimulus  intensity  as  such,  is  certainly  the  tendency  of  large  response  changes 
to  be  concentrated  in  the  region  relatively  close  to  threshold  excitation. 
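
The four response measures listed above (latency, duration of the spike group, total number of spikes, and initial rate of discharge) can be illustrated with a small sketch operating on a list of spike times. The function name, the millisecond units, and in particular the definition of "initial rate" as the reciprocal of the first interspike interval are assumptions made here for illustration; the text does not specify how the rate was measured.

```python
def spike_group_measures(spike_times_ms, stimulus_onset_ms=0.0):
    """Summarize one spike group recorded from a single tympanic fiber.

    Returns None when no spikes were recorded (a sub-threshold stimulus).
    """
    if not spike_times_ms:
        return None
    times = sorted(spike_times_ms)
    # Assumption: "initial rate" is taken from the first interspike interval.
    if len(times) > 1:
        initial_rate_per_s = 1000.0 / (times[1] - times[0])
    else:
        initial_rate_per_s = None
    return {
        "latency_ms": times[0] - stimulus_onset_ms,   # 1) latency of first spike
        "duration_ms": times[-1] - times[0],          # 2) duration of the group
        "spike_count": len(times),                    # 3) total number of spikes
        "initial_rate_per_s": initial_rate_per_s,     # 4) initial discharge rate
    }


if __name__ == "__main__":
    # Hypothetical spike times, not data from the experiments cited.
    print(spike_group_measures([3.0, 4.0, 5.5, 8.0]))
```

A tabulation of these four numbers against stimulus level would be one way of making visible the concentration of response change near threshold noted above.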

SOME  INPUT-OUTPUT  RELATIONS 

FOR  THE  TWO  FIBERS  AT  ONE  SIDE 

Figures  49  and  50  illustrated  response  changes  noted  in  a  given  sensitive 
tympanic fiber (A1) as the pressure level of a standard stimulus was increased
by constant decibel increments. Figure 49 also showed the initiation of firing by
the less sensitive fiber (A2). Evaluation of a moth’s actual responses to an approaching
bat, however, is more readily visualized in terms of equal increments
of  distance  of  the  stimulus  source.  This  situation  is  summarized  in  Figure  51 
for  the  four  tympanic  fibers  (two  at  the  near  side  and  two  at  the  far  side — 
assuming  the  moth  to  be  flying  across  in  front  of  the  bat).  As  in  Figures 
49  and  50,  the  response  is  shown  in  terms  of  “excitation  triangles,”  where  the 
height  of  the  triangle  is  one  less  than  the  number  of  spikes  in  the  group,  and 
the  length  of  the  base  designates  duration.  Though  the  illustration  assumes  a 
brief stimulus pulse of constant length and frequency composition, this idealization
probably would not result in response trends by the moth very significantly
different  from  those  shown. 
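
The "excitation triangle" convention described above (height one less than the number of spikes, base equal to duration) is simple enough to state as a small data structure. The sketch below is illustrative only: the class and function names are hypothetical, and the duration is assumed here to run from the first spike to the last, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExcitationTriangle:
    base_ms: float   # length of the base: duration of the spike group
    height: int      # one less than the number of spikes in the group


def excitation_triangle(spike_times_ms):
    """Build the triangle for one spike group.

    Assumption: duration is measured from the first spike to the last.
    """
    if not spike_times_ms:
        raise ValueError("no spikes: the fiber did not respond")
    times = sorted(spike_times_ms)
    return ExcitationTriangle(base_ms=times[-1] - times[0],
                              height=len(times) - 1)


if __name__ == "__main__":
    # Hypothetical spike times for a strong and a near-threshold response.
    print(excitation_triangle([2.0, 2.8, 3.9, 5.5, 7.0]))
    print(excitation_triangle([6.0]))
```

With this convention a single spike reduces to a point (zero base, zero height), so that the triangle grows from nothing as the response rises above threshold.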

The  response  trends  for  the  two  fibers  at  the  near  (left)  side  are  shown  by 
the  triangles  above  the  reference  axes,  while  the  response  trends  for  the  two 
fibers  at  the  far  (right)  side  are  indicated  by  the  triangles  below.  Solid  lines  are 
used for the more sensitive (A1) fibers and broken lines for the less sensitive
(A2)  fibers.  If  one  compares  the  response  trends  in  the  two  near-side  (left) 
fibers,  as  a  function  of  the  distance  of  the  source,  it  is  clear  that  the  second 
fiber  would  be  firing  only  over  about  half  the  total  range  to  which  the  first  fiber 
would  be  responding.  Moreover,  the  build-up  of  excitation  in  the  second  fiber 
would  be  much  faster.  As  indicated  earlier,  saturation  would  occur  in  both 
fibers  while  the  bat  was  still  at  some  distance. 

The  sound  pressure  line  of  Figure  51  is  given  for  one  frequency,  40  kc, 
corresponding  to  an  attenuation  rate  by  the  atmosphere  of  roughly  one-third 
to  one-quarter  of  a  decibel  per  foot.  The  increasing  rate  of  attenuation  with 
higher  frequencies  is  illustrated  in  Figure  52  and  tabulated  in  Table  10  of 
Reference  8.  A  large  portion  of  the  energy  in  the  signals  of  Eptesicus  (used 
in  response  distance  studies  by  Roeder  and  Treat)  lies  in  the  20  to  40  kc 
region,  while  the  energy  in  the  signals  of  Lasiurus  and  Myotis  (except  for  the 
buzz  of  Myotis)  is  mostly  concentrated  from  40  kc  to  perhaps  80  or  90  kc. 



The  present  value  is  thus  intermediate  between  the  several  bats.  The  absolute 
level  of  the  sound  pressure  line  is  anchored  by  the  levels  of  typical  bat  pulses  as 
given,  for  example,  by  Griffin  (8).  It  is  still  necessary,  however,  to  decide  upon 
a  reasonable  figure  for  the  sensitivity  threshold  of  a  typical  flying  moth.  A  value 
of 0.001 dynes/cm² is probably not too far from the threshold level of some
moths. This is 14 db above the nominal value of 0.0002 dynes/cm² for the human
threshold under laboratory conditions, and would give a distance of detection
(for Eptesicus signals) of slightly over 100 feet, consistent with the field observations
of Roeder and Treat.
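
The 14 db figure follows directly from the two pressures quoted, and the way a detection distance might be derived from it can be sketched as well. In the sketch below the decibel conversion uses only the values in the text; the range calculation, however, rests on assumptions that are not in the text: simple spherical spreading plus a constant atmospheric attenuation per foot, and a purely hypothetical source level and reference distance. It should not be read as the procedure behind Figure 51.

```python
import math

REFERENCE_PRESSURE = 0.0002   # dynes/cm^2, nominal human threshold (from the text)
MOTH_THRESHOLD = 0.001        # dynes/cm^2, suggested moth threshold (from the text)


def level_db(pressure, reference=REFERENCE_PRESSURE):
    """Sound pressure level in decibels relative to the reference pressure."""
    return 20.0 * math.log10(pressure / reference)


def received_level_db(distance_ft, source_db, ref_distance_ft, atten_db_per_ft):
    """Level at distance_ft, assuming spherical spreading plus linear absorption."""
    spreading = 20.0 * math.log10(distance_ft / ref_distance_ft)
    absorption = atten_db_per_ft * (distance_ft - ref_distance_ft)
    return source_db - spreading - absorption


def detection_distance_ft(threshold_db, source_db, ref_distance_ft,
                          atten_db_per_ft, max_ft=1000.0):
    """Largest distance (by bisection) at which the level still reaches threshold."""
    lo, hi = ref_distance_ft, max_ft
    if received_level_db(hi, source_db, ref_distance_ft, atten_db_per_ft) >= threshold_db:
        return hi
    if source_db < threshold_db:
        return lo
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if received_level_db(mid, source_db, ref_distance_ft, atten_db_per_ft) > threshold_db:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    threshold = level_db(MOTH_THRESHOLD)
    print(f"moth threshold: {threshold:.1f} db above 0.0002 dynes/cm^2")
    # HYPOTHETICAL source: 100 db measured at 0.33 ft, 0.3 db/ft attenuation.
    d = detection_distance_ft(threshold, 100.0, 0.33, 0.3)
    print(f"detection distance under these assumptions: {d:.0f} ft")
```

The result is quite sensitive to the assumed source level and attenuation rate, so any such estimate is best treated as an order-of-magnitude check against the field observations rather than as a prediction.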

If  one  assumes  that  the  moth’s  nervous  system  is  capable  of  making  certain 
crude  measurements  upon  the  spike  patterns  relayed  by  way  of  the  tympanic 
nerve,  it  is  logical  to  ask  about  the  kinds  of  measures  that  might  be  most 
profitable.  Significant  intensity  levels,  or  shifts  of  level,  might  for  example  be 
given  by  such  measures  as:  1)  the  firing  rates  of  one  fiber,  2)  the  difference  or 
ratio between A1 and A2 firings, 3) the difference in latencies, or 4) group numbers.
Other kinds of measures might relate to differences between the early part
of  a  group  and  a  later  part.  For  example,  the  response  of  a  moth’s  system  to 
direct  and  echoing  signals  is  shown  in  Figure  6B  of  Reference  30.  Even  though 
the portion of a bat’s signal received directly might produce saturation, portions
reflecting off nearby objects, or the ground, might provide usable indications
as to the distance (and possibly direction) of the object or surface. Other
groupings  of  significance  might  be  related  to  the  phase  of  pursuit  of  a  near-by 
bat  (see,  for  example,  Figures  4  and  5  of  Reference  31).  Here,  the  appropriate 
measures  might  act  to  trigger  specific  evasive  tactics  or  to  set  off  (or  shift  in 
character)  moth-emitted  pulses  which  act  to  cause  cessation  of  the  bat’s  pursuit. 


RELATIONS  BETWEEN  TYMPANIC  SPIKE 
PATTERNS  AT  THE  TWO  SIDES 

It is generally assumed that bilaterality in sense organs is chiefly related to: a) directional
determinations, b) range determinations, c) evaluation of motion, and
d)  reliability,  speed,  and  precision  of  spatial  evaluations  in  general.  For  the  sound- 
sensitive  moth,  the  main  contribution  of  the  bilateral  receiving  systems  appears 
to  be  directional  evaluation  in  terms  of  azimuth.  The  lighter  lines  below  the 
reference  axes  in  Figure  51  represent  the  response  of  the  fibers  at  the  far  side. 
(Some  connecting  lines  have  been  omitted  for  clarity.)  The  trends  shown  are 
at  best  only  very  approximate  but  give  some  idea  of  the  kinds  of  differences  that 
may  occur  between  near  and  far  sides.  The  most  obvious  difference,  perhaps,  is 
the  increased  latency  near  threshold  for  each  type  of  fiber.  Firing  rate  and  group 
number  may  also  be  significantly  different  at  the  two  sides  at  low  levels  of  the 
signal; but high signal levels appear to abolish differential directional indications.

Examples  of  left-right  differences  in  the  response  to  red  bats  flying  outdoors 
are  given  in  Figure  7  of  Reference  30.  Concerning  the  observed  relations  the 
authors  advance  the  following  interpretation: 



“If  it  is  assumed  that  the  bat  is  first  detected  at  100  feet  and  approaches 
on  a  straight  path  at  right  angles  to  the  moth’s  course  .  .  .  the  differential 
tympanic  nerve  response  would  diminish  throughout  the  approach  and 
disappear  completely  when  the  bat  was  15  to  20  feet  away”  (p.  144). 

The  moth’s  ears  are  also  individually  sensitive  to  the  direction  of  ultrasound 
(Reference  32,  Figure  6);  and  hence  a  moth  with  only  one  functioning  tympanic 
system  (as  often  occurs  with  invasion  by  mites)  may  gain  at  least  primitive 
directional  indications. 

RELATIONS  INVOLVING  THE  RESPONSE  OF 
INTACT  MOTHS  TO  BAT-TYPE  SOUNDS 

The  value  of  directional  data  to  the  moth,  as  already  mentioned,  may  be  primarily 
that  it  enables  the  moth  to  turn  away  from  the  bat,  thus  greatly  reducing  its  own 
reflectivity.  No  reliable  estimates  currently  exist  as  to  how  far  away  a  bat  can 
detect  an  average  flying  moth.  Common  guesses  range  over  a  factor  of  10,  i.e., 
from 5 to 50 feet (about 1½ to 15 m). Two things are certain, however: 1)
that  a  moth  can  normally  hear  an  approaching  bat  before  the  bat  can  detect 
echoes  from  the  moth,  and  2)  a  moth  flying  with  its  axis  parallel  to  the  direction 
of  the  bat  cannot  be  detected  nearly  as  far  away  as  one  that  is  flying  crosswise. 
Since  it  is  doubtful  that  a  moth  has  any  memory  of  a  bat’s  direction,  and  since 
the  moth’s  system  appears  to  saturate  (and  abolish  directional  sensitivity)  when 
a bat (other than a soft-signalled bat like Plecotus) comes near, the moth is
presumably  incapable  of  directional  response  as  the  bat  attacks.  The  frequent 
propensity  of  moths  to  dive  or  spiral  downward  upon  the  close  approach  of  a 
bat suggests that reference to gravity may be the predominant directional influence
when bats are close at hand. Current evidence certainly tends to support
the  hypothesis  that  directional  response  occurs  when  a  bat  is  at  some  distance, 
typically  prior  to  a  bat’s  detection  of  the  moth. 

Various  features  of  the  responses  of  intact  moths  to  audible  and  ultrasonic 
frequencies  have  been  tested  (34,  35,  30,  27),  including:  initiation  and  cessation 
of  flight,  change  in  wingbeat  pattern,  onset  of  evasive  action,  initiation  and 
cessation  of  click  generation,  and  other  actions.  Measured  response  times 
ranged for the most part between about 1/12 and 1/3 sec (even up to one
second),  response  times  for  typical  night  temperatures  being  rather  longer  than 
response  times  at  daytime  temperatures.  High  intensities  of  sound  produced 
much  quicker  responses,  with  less  temporal  variability,  than  did  low  intensities. 
By  directing  bat-type  pulses  at  flying  moths,  Roeder  observed: 

“Moths  turned  away  from  the  sound  source  when  they  were  at  some 
distance  or  the  sound  intensity  was  low.  Flight  on  a  relatively  straight 
path was noted in an upward, lateral, or downward direction, depending
on the position of the moth relative to the sound when the
ultrasonic  pulse  train  was  initiated”  (27,  p.  56). 

In contradistinction to these “directional maneuvers,” “nondirectional maneuvers”
were noted at higher intensities, including:



“passive  fall  with  folded  wings,  a  power  dive,  an  alternation  of  passive 
fall  and  power  dive,  sharp  turns  and  loops  followed  by  passive  fall  or 
power  dive,  a  continuous  series  of  tight  turns,  and  other  complex 
maneuvers”  (27,  pp.  56-57). 

In  a  very  limited  set  of  sample  tests,  conducted  with  the  help  of  T.  F.  Gregg, 
moths  were  tethered  to  the  ceiling  of  the  bat  flight  room  by  use  of  long  fine 
wires  (0.001  inch  or  0.002  inch  diameter)  while  bats  made  attempts  to  capture 
them.  The  tethered  moths  initiated  many  of  the  maneuvers  described  by  Roeder, 
sometimes  as  a  function  of  the  bat’s  distance  or  phase  of  pursuit  but  sometimes, 
apparently,  in  some  relation  to  the  duration  or  constancy  of  a  bat’s  close  range 
activity,  suggesting  some  stored  representation  of  the  bat’s  near-by  presence. 

From  tests  such  as  those  mentioned  in  this  section  there  is  certainly  evidence 
that  moths  are  incapable  of  the  quickly  initiated  and  accurately  directed  responses 
so  typical  of  the  bat.  The  moth,  for  its  escape,  appears  to  depend  heavily  upon: 
1)  unpredictability  in  the  specific  timing  and  direction  of  its  maneuvers,  2) 
sharp  turning  radius  at  appreciable  flight  speeds  (perhaps  often  5  to  10  feet  per 
second),  3)  loops  or  spirals  which  may  tend  to  cause  hunting  in  the  bat’s 
predictive  procedure,  and  4)  diving  for  protective  areas  such  as  the  grass.  The 
moth  cannot  outfly  the  bat,  nor  escape  reliably  by  any  readily  predicted  tactic. 

QUANTITATIVE  SURVEY  OF  BAT-MOTH  ENCOUNTER 

The  final  step  in  this  brief  and  rough  review  of  the  ultrasonic  interrelations  of 
bats  and  moths  is  to  summarize  diagrammatically  some  of  the  main  features  of 
a  representative  pursuit  situation.  Figure  52  gives  a  rather  generalized  and 
idealized  quantitative  view  of  such  an  encounter.  The  bat  ( Lasiurus  borealis ) 
is  shown  approaching  uniformly  along  the  Z-axis,  while  the  moth  is  traversing 
slowly  along  the  direction  of  the  X-axis.  The  sequence  of  X-axes  that  go  out 
from  the  points  of  pulse  emission  by  the  bat,  however,  represent  separate 
expansions  of  the  Z-axis — showing,  toward  the  end  of  the  pursuit,  the  relevant 
time  relations  of  pulses  and  echoes.  The  Y-axis  is  used  for  the  designation  of 
magnitude  (in  this  plot,  of  sound  level).  The  three  dotted  arrows  (A,  B  and 
C)  indicate  possible  response  times,  for  a  typical  moth,  to  different  phases  of 
the  bat’s  pursuit.  Figures  53  and  54  show  final  pursuit  in  greater  detail. 

As  indicated  earlier,  the  moth’s  specific  evaluations  of  the  bat’s  position 
tend  to  deteriorate  and  perhaps  disappear  as  the  bat  approaches,  whereas  just 
the  reverse  is  true  of  the  bat.  The  bat’s  reliability  of  detection  and  its  speed 
and  precision  of  interception  increase  markedly  as  it  gets  close  to  its  target. 
There  is  thus  an  enormous  discrepancy  between  the  region  of  maximal  acousti¬ 
cal  effectiveness  on  the  part  of  the  moth  as  compared  with  that  of  the  bat. 
Logically,  the  moth’s  advantage  would  thus  seem  to  lie  in  its  capacity  to  prepare 
for,  and  perhaps  forestall,  the  close  approach  of  the  bat.  Yet  moths  keep 
flying,  even  in  the  presence  of  many  circling  bats.  Often  they  appear  to  rely, 
for  escape,  upon  some  last-instant  tactic,  suggesting  that  certain  triggering 
mechanisms may act under relatively high levels of the bat’s signals. A few
moths appear to have evolved a special bag of tricks: the production of
ultrasonic  pulses  themselves.  How  these  pulses  act  upon  the  different  kinds  of 
bats,  and  what  they  may  signify  in  terms  of  warning,  deception,  or  confusion 
remain to be discovered. The evidence cited suggests that they may sometimes
be  very  effective.  Undoubtedly  many  aspects  of  this  intricate  game  between 
pursuer  and  pursued  remain  to  be  discovered;  and  surely  the  true  significance 
of  the  many  patterns  in  the  existing  picture  may  take  a  very  long  time  to 
unravel. 


Figure 39 Close Pursuit of a Moth: Physical Situation. The bat, a Myotis lucifugus,
is  closely  approaching  a  moth  (which  is  flying  a  course  roughly  at  right  angles  to 
the  path  of  the  bat).  A  terminal  pulse  of  approximately  one-third  millisecond  dura¬ 
tion  is  shown,  approximately  to  scale,  starting  to  echo  back  from  the  moth.  No 
attempt  is  made  to  show  the  exact  energy  radiation  pattern  of  the  signal.  Some 
energy  is  actually  dispersed  in  all  directions. 
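
The scale of the terminal pulse in Figure 39 can be checked with a short sketch. It assumes a speed of sound of about 1130 feet per second (a typical room-temperature value, not stated in the caption); the function names are illustrative only.

```python
SPEED_OF_SOUND_FT_S = 1130.0   # assumed value; varies with temperature


def pulse_length_ft(duration_ms):
    """Distance in air occupied by a pulse of the given duration."""
    return SPEED_OF_SOUND_FT_S * duration_ms / 1000.0


def overlap_range_ft(duration_ms):
    """Target range below which the echo begins to return
    before the bat has finished emitting the pulse."""
    return pulse_length_ft(duration_ms) / 2.0


# One-third millisecond terminal pulse, as in the caption.
terminal_length = pulse_length_ft(1.0 / 3.0)     # a little under 0.4 ft
terminal_overlap = overlap_range_ft(1.0 / 3.0)   # a little under 0.2 ft
```

Under this assumption the terminal pulse occupies roughly four and a half inches of air, which is why it can be drawn to scale beside the bat and moth.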



Figure  40  Vector  Representation  of  Figure  39.  The  horizontal  line  of  arrows  in¬ 
dicates  the  approach  path  of  the  bat  (at  15  feet  per  second)  while  the  vertical  line 
of arrows designates the course of the moth (at 7½ feet per second). Each arrow
segment represents one-tenth second (corresponding to 1½ feet of travel for the
bat,  and  3/4  foot  for  the  moth).  A  typical  set  of  echolocating  pulses  is  shown  along 
the  path  of  the  bat.  Box  shows  area  included  in  Figure  39  with  small  arrow  marking 
the  pulse.  (In  the  laboratory,  the  courses  of  bat  and  moth  are  seldom  straight,  as 
indicated  here,  but  out  of  doors  they  often  approximate  it:  see  Figure  27.) 
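
The segment lengths in Figure 40 follow directly from the flight speeds. The sketch below assumes straight, constant-speed courses as drawn; the moth speed is the value implied by 3/4 foot of travel per tenth-second, and the function name is hypothetical.

```python
def positions(speed_ft_per_s, step_s=0.1, steps=10):
    """Distance travelled (feet) at the end of each arrow segment."""
    return [round(speed_ft_per_s * step_s * k, 3) for k in range(1, steps + 1)]


BAT_SPEED = 15.0    # feet per second, from the caption
MOTH_SPEED = 7.5    # feet per second, implied by 3/4 foot per tenth-second

bat_path = positions(BAT_SPEED)
moth_path = positions(MOTH_SPEED)

# Length of one tenth-second arrow segment for each animal.
bat_segment = bat_path[0]
moth_segment = moth_path[0]
```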


Figure 41 Bat and Moth Pulses. Columns A and C (white on black) were made
outdoors at Tyringham, Massachusetts, while Column B was made in the laboratory.
Tracing A-1 shows the cruising pulse of an unknown bat, presumably
a Myotis, which was circling in the area. Interspersed with some of the pulses of
this  bat  were  groups  of  short  pulses — usually  five  in  number — which  appeared  to  be 
ultrasonic  clicks  or  pulses  of  an  arctiid  moth  which  was  also  flying  in  the  area.  A 
typical  group  is  shown  in  A-2.  The  output  of  a  meter  measuring  intervals  between  the 
zero  crossings  of  the  pulse  carrier  appears  in  all  tracings.  For  the  bat  pulses  it  shows 
as  a  sloping  line  (or  row  of  dots),  the  initial  lower  end  of  the  line  representing  short 
zero-crossing  intervals  (hence,  higher  frequencies),  the  terminal  upper  end  of  the 
line  representing  lower  frequencies.  The  moth  pulses  have  no  significant  frequency 
sweep,  and  the  meter  output  appears  as  a  cluster  of  dots  which  correspond,  in  the 
present  record,  to  an  average  frequency  of  almost  exactly  80  kc/sec.  The  individual 
pulses  of  the  moth  are  approximately  one-sixth  millisecond  in  duration,  the  whole 
group  normally  occupying  about  two  milliseconds.  (Time  markings  at  the  bottom 
of  the  tracings  are  either  square  waves  or  pulses  derived  from  a  1  kc  oscillator.) 
The  smaller  scale  tracing,  A-3,  is  a  simultaneous  recording  of  one  of  the  cruising 
bat  pulses  and  a  group  of  clicks  or  pulses  from  the  moth.  Several  groups  of  clicks 
preceded  the  one  shown  here,  the  moth  apparently  sending  out  successive  groups 
of  pulses  while  the  bat  was  near  by.  No  specific  time  relations  were  in  evidence,  how¬ 
ever,  between  the  timing  of  the  bat’s  pulses  and  the  moth’s  clicks. 

Column B illustrates typical changes in pulse form occurring as a Myotis lucifugus
executes  a  successful  pursuit.  The  unparenthesized  numbers  indicate  approximate 
pulse  duration,  while  the  parenthesized  figures  designate  pulse  frequency  range.  The 
output  of  the  zero-crossing  meter  forms  a  relatively  straight  line  for  most  of  the 
pulses,  suggesting  a  hyperbolic  type  of  frequency  drop  (since  equal  ordinate  distances 
represent  reciprocals  of  frequency,  50  kc  being  1  cm  above  100  kc,  25  kc  being  2  cm 
above  50  kc,  and  so  on).  With  the  pulses  of  Myotis  lucifugus  the  starting  frequency 
falls  markedly  during  the  transition  from  cruise  to  pursuit.  Rate  of  frequency  sweep 
is  greatly  increased  as  the  pulse  length  shortens,  total  frequency  sweep  remaining  at 
first  more  or  less  constant,  then  declining.  Pulse  repetition  frequency  during  the 
terminal  buzz  is  182  pulses  per  second. 
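
The link between a straight meter trace and a hyperbolic frequency drop can be illustrated numerically. This is only a sketch: it assumes the ordinate is proportional to the carrier period, with a scale of 1 cm per 10 microseconds chosen because it reproduces the spacings quoted above (50 kc lying 1 cm above 100 kc, and 25 kc a further 2 cm above 50 kc). The function names are hypothetical.

```python
def ordinate_cm(freq_kc, cm_per_microsecond=0.1, ref_kc=100.0):
    """Height of the meter trace above the ref_kc line, in cm.

    Assumes the ordinate is proportional to the carrier period (1/f).
    """
    period_us = 1000.0 / freq_kc        # period in microseconds
    ref_period_us = 1000.0 / ref_kc
    return (period_us - ref_period_us) * cm_per_microsecond


def linear_period_sweep(f_start_kc, f_end_kc, n=5):
    """Frequencies at n equally spaced times when the period rises linearly,
    i.e. when the meter trace is a straight line."""
    p0, p1 = 1.0 / f_start_kc, 1.0 / f_end_kc
    return [1.0 / (p0 + (p1 - p0) * k / (n - 1)) for k in range(n)]


heights = [ordinate_cm(f) for f in (100.0, 50.0, 25.0)]
sweep = linear_period_sweep(100.0, 50.0)
drops = [sweep[k] - sweep[k + 1] for k in range(len(sweep) - 1)]
```

With a straight-line trace the frequency falls fastest at the start of the pulse and progressively more slowly toward the end, which is the hyperbolic pattern described.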

Column  C  shows  pulses  of  Lasiurus  borealis  in  the  field  during  the  transition  from 
cruise  to  initial  pursuit  (final  pursuit  pulses  became  too  weak  in  this  recording  for 
proper  reproduction).  The  cruise  or  search  pulses  of  a  circling  red  bat  usually  range 
from 7 to 11 milliseconds in duration and show some variations of frequency range.
Unlike  the  case  with  Myotis  lucifugus,  however,  the  transition  to  pursuit  is  ac¬ 
companied  by  relatively  little  decline  in  initial  pulse  frequency.  In  this  record,  the 
initial  frequency  remained  at  about  70  kc,  in  contrast  to  the  drop  down  to  roughly  30 
kc  often  noted  in  corresponding  Myotis  records.  The  pulse  duration  of  Lasiurus  also 
tends  to  remain  somewhat  longer.  Note  that  the  end  of  the  long  cruising  pulses  is 
essentially  constant  in  frequency. 




Figure  42  Wing  Catch  of  Moth  by  Myotis  lucifugus.  This  picture  shows  an  actual 
encounter  not  too  different  at  its  terminal  phase  from  the  hypothetical  illustration 
of  Figure  39.  Many  such  pictures  were  made  by  catapulting  the  moth  from  below 
as  the  bat  approached.  Characteristically,  the  moth  would  begin  to  fly  as  it  reached 
the  peak  of  the  trajectory,  sometimes  also  initiating  immediate  evasive  maneuvers. 
Use  of  the  wing  for  such  catches  or  attempts  was  very  common,  the  proportion  of 
successful  catches  varying  greatly  from  bat  to  bat,  and  according  to  the  violence  of 
the  moth’s  maneuvers.  In  a  few  instances,  the  tips  of  the  moth’s  wings  were  clipped 
before  catapulting  to  reduce  the  scope  of  the  moth’s  maneuvers  and  keep  the  moth 
within  the  photographic  field. 




Figure  43  Last-Instant  Decision  Not  to  Catch.  The  final  image  shows  the  moth 
diving  down  and  the  bat  continuing  its  flight  course.  (Note  shadow  of  moth  on  bat’s 
wing.) 




Figure  44  Catch  and  Drop  of  a  Moth.  The  second  image  shows  the  bat  dropping 
the  moth  following  wingtip  capture. 


Figure  45  Some  Message  Paths  in  Bat-Moth  Encounters. 


This  highly  schematic  view  shows  a  bat  approaching  a  sound-emitting  moth  in 
the  presence  of  various  other  near-by  objects:  the  foliage  of  a  tree  or  bush,  a 
falling  leaf,  and  a  mosquito  (at  closer  range  than  the  moth).  The  view  is  in¬ 
tended  merely  to  suggest  some  of  the  kinds  of  evaluations,  discriminations,  selec¬ 
tions,  and  resolutions  typical  of  many  interception  situations.  Often  only  a  very 
small  portion  of  the  total  ultrasound  reaching  the  bat  represents  the  echo  of  the 
desired  primary  target.  To  achieve  rapid  and  effective  response  to  the  relevant 
echoes,  the  bat  is  presumably  able  to  gate  out  many  more  prominent  sounds  at 
early  processing  levels.  Efferent  pathways  to  the  external  ears,  the  middle  ear  muscles, 
and  the  hair  cells  of  the  inner  ear  (see  efferent  pathways  in  Figure  46)  may  play 
a  role  in  some  of  the  more  peripheral  gating,  selecting,  and  correlating  devices. 
These  can  probably  be  activated  by  very  rapidly  operating  mechanisms,  as  sug¬ 
gested  in  the  diagram  of  the  bat.  Rapid  adjustments  of  the  emitted  signal  may  also 
be  possible.  Related  sensory  impressions  from  vestibular,  kinesthetic,  and  tactile 
systems  must  also  be  integrated  into  the  response  mechanisms  at  various  levels. 
While  some  responses  are  presumably  set  off  very  quickly  on  the  basis  of  returns  from 
a  single  pulse,  others  such  as  those  to  target  path  or  wing  action  depend  on  more 
complex  analysis  over  several  pulses.  During  a  given  test  run  in  the  laboratory  a 
bat  also  shows  a  certain  amount  of  adjustment  or  learning  as  to  the  nature  of  the 
expected  interception  situations.  The  blocks  illustrated  connote  only  the  broadest 
of  such  concepts. 

Part  of  the  signal  to  the  moth  is  shown  reflecting  back  while  a  small  additional 
portion  reaches  the  moth’s  tympanic  organs  (TO).  These  receptive  devices  send 
messages  to  the  sound  receptive  region  of  the  pterothoracic  ganglion.  Almost  noth¬ 
ing  is  known  of  what  happens  in  the  moth’s  nervous  system  beyond  the  afferent 
tympanic  nerve.  Blocks  suggesting  processing  and  activating  mechanisms  are  hypo¬ 
thetical.  Coordinated  motor  actions  of  various  sorts  are,  however,  released  in  response 
to  received  ultrasound,  notably  those  governing  flight  pattern  (as  suggested  by  arrows 
to  the  flight  muscles  [FM])  and  activation  of  the  sound  generating  tymbal  organs 
(SG).  Clicks  or  pulses  from  these  organs  obviously  reach  the  bat  from  the  same 
direction  as  echoes  of  the  bat  pulses  returned  by  reflection  from  the  moth. 




Figure  46  Schematic  View  of  the  Auditory  System  of  an  Insectivorous  Bat. 
This  diagram  is  intended  only  as  a  very  general  and  approximate  guide  to  the 
kinds  of  pathways  and  connections  apparently  existing  in  the  bat’s  auditory  sys¬ 
tem.  In  this  crude  two-dimensional  representation  anatomical  relations  are,  of 
course,  grossly  distorted.  Highly  compact  and  close  together,  many  of  the  centers 
are  actually  so  close  that  they  merge — often,  apparently,  under  the  influence  of 
“neurobiotaxis”  (the  growing  toward  each  other  of  the  most  functionally  related 
elements).  The  details  shown  are  drawn  almost  exclusively  from  Grinnell  (15). 
Heavy  boxes  indicate  mechanisms  or  centers  that  are  strikingly  developed  in  the 
bat  (that  is,  the  microchiroptera,  but  not  the  megachiroptera).  Relative  to  other 
mammals,  the  most  outstandingly  developed  center  in  the  bat,  perhaps,  is  the  nucleus 
of  the  lateral  lemniscus  ( LLN ).  Its  exact  role  is  unknown,  but  presumably  it  has 
important  governing  functions  in  the  rapid  processing  of  auditory  inputs  and  in  the 
manner of their use. Of unusual interest, also, is the medial superior olive (MSO),
each of its very large cells apparently receiving inputs from the cochlear nuclei of
both sides.

The  outer  sets  of  boxes  at  the  two  sides  represent  the  peripheral  mechanisms 
respectively  of  the  outer  ear  ( OE ),  the  middle  ear  (ME),  and  the  inner  ear  (IE). 
Passing  centrally  from  the  inner  ear  is  the  auditory  nerve  which  incorporates  the 
spiral  ganglion  (where  the  cell  bodies  of  the  first  order  neurones  lie).  Three  efferent 
pathways leading out to the peripheral mechanisms are shown: (1) the olivocochlear
nerve, passing out the auditory nerve, where it apparently has terminations
on  the  hair  cells  of  the  basilar  membrane,  (2)  the  motor  nerve  governing  the  middle 
ear  muscles  (Tensor  tympani  and  Stapedius),  and  (3)  the  motor  nerves  specifying 
the  position,  action,  and  configuration  of  the  outer  ear. 

The  auditory  nerve  splits  into  three  branches  which  go  respectively  to  the  three 
divisions of the cochlear nucleus: the oral ventral cochlear nucleus (OVCN), the
caudal ventral cochlear nucleus (CVCN), and the more separated dorsal cochlear
nucleus  ( DCN ).  Precisely  how  the  different  categories  of  ascending  fibers  disperse 
within  the  cochlear  nuclei  is  unknown  and  likewise  just  how  they  emerge  to  form 
the  ascending  paths  of  the  central  auditory  system.  Fifty  cell  types  have  been 
identified  in  the  cochlear  nuclei.  Most  fast-conducting  fibers  appear  to  cross  to  the 
contralateral  olivary  complex,  while,  for  the  most  part,  fine  and  more  slowly-con¬ 
ducting  fibers  ascend  ipsilaterally.  Ipsilateral  distribution  is  multiple:  to  the  pre- 
olivary  nuclei  (PON),  to  the  medial  superior  olive  (see  above),  to  the  nucleus  of 
the  lateral  lemniscus,  and  perhaps  elsewhere.  Contralateral  distribution  appears 
mostly to the trapezoid nucleus (TN), perhaps also to a small adjacent nucleus
(lateral trapezoid nucleus) identified only in certain bats, and to the medial superior
olive,  mentioned  above.  The  dorsal  cochlear  nucleus  sends  a  bundle  to  the  vermis 
of  the  cerebellum  ( CBL ),  and  it  receives  an  efferent  nerve  of  uncertain  origin  (per¬ 
haps  from  the  reticular  formation). 

Various  connections  appear  to  go  to  and  from  the  so-called  “secondary  nuclei” 
(olivary  complex  and  nuclei  of  the  lateral  lemniscus — including  in  some  bats  an 
additional  dorsal  nucleus  close  to  the  posterior  colliculus).  Ascending  pathways, 
presumably  with  both  unilateral  and  bilateral  representations,  pass  to  the  lateral 
lemniscus  nuclei  of  both  sides  and  to  the  ipsilateral  posterior  colliculus  (PC).  Con¬ 
nections  are  also  made  with  the  contralateral  olivary  complex  and  with  the  central 
motor  systems  (notably  the  pontine  nuclei,  PN)  mediating  coordination  and 
equilibrium.  Important  ascending  pathways  go  from  the  nucleus  of  the  lateral 
lemniscus  to  the  ipsilateral  posterior  colliculus. 

The  posterior  colliculi  (PC)  in  bats,  though  not  showing  the  greatest  hypertrophy 
relative  to  other  mammals,  are  the  largest  nuclei  of  the  auditory  system.  They 
receive  mostly  third-  and  fourth-order  fibers,  primarily  contralateral,  interchange 
many  fibers  with  the  contralateral  colliculus,  and  have  numerous  connections  with 
the  motor  system.  Presumably  they  play  a  very  dominant  role  in  the  complex  audi¬ 
tory  analyses  that  govern  oriented  behavior. 

The  higher  auditory  centers,  the  medial  geniculate  (MG)  and  auditory  cortex 
(AC), are “both absolutely and relatively much smaller than the posterior colliculi” (15,
p.  26).  Ascending  pathways,  of  lesser  size  than  those  leading  to  the  colliculi,  go  to 
these  centers.  Connections  go  to  other  cortical  areas  (OCA)  and  to  the  motor 
system  (MC  =  motor  cortex).  The  cortex  in  bats  may  play  an  important  role  in 
learning  and  the  various  adjustments  and  variations  seen  in  their  responses  during 
tests  on  interception  performance.  That  bats  often  take  many  trials  to  learn  simple 
things  may  be  related  to  the  very  limited  cortical  capacity  of  the  brain. 

The  heavy  central  region  represents  the  descending  motor  system  (desc.  mot.  sys.). 
No  differentiation  into  pathways  or  centers  (other  than  as  already  mentioned)  has 
been  indicated.  Broken  lines  drawn  from  the  ascending  centers  to  the  motor  system 
are  intended  only  to  show  that  connections  to  the  motor  system  exist  at  various 
levels — some  of  these  connections  presumably  being  of  much  importance  in  the 
rapid  early  responses  typical  of  a  bat’s  interception  behavior. 

The  dotted  region  in  the  center  represents  the  brain  stem  reticular  formation 
(RF).  Incorporating  connections  of  many  types,  this  system  may  serve  a  number 
of  functions.  It  relates  sensory  and  motor  systems  both  locally  and  in  more  general 
ways,  apparently  serving  to  inhibit  or  facilitate  the  transmission  of  specific  messages. 
In  higher  forms,  it  appears  to  mediate  aspects  of  attentional  focus  and  motivational 
propensity. 

Descending broken lines (drawn at the sides for clarity only) indicate descending
central control of the ascending pathways, as discussed in several papers by Galambos.
The extent of such pathways and precisely how they operate remain as yet largely
unknown,  but  they  may  well  serve  in  part  to  govern  the  selective  processing  mech¬ 
anisms  that  keep  the  potentially  expanding  complexities  of  interception  within  bounds. 


Figure 47 Schematic Representation of Tympanic System of Noctuid Moth. For
the sake of simplicity various omissions or distortions are made. For example, the
hood, which partially covers the tympanic opening, is omitted; and the supporting
Bügel and ligament within the tympanic air sac are only suggestive. Also omitted is
the  “B-cell”  of  the  tympanic  sensillum,  which  may  serve  some  such  proprioceptive 
function  as  transmitting  impulses  in  relation  to  position  or  action  of  the  wings.  (See 
text  for  details  of  acoustic  function.) 




Figure 48 Possible Arrangement for Recording Tympanic Response. Pulser provides
standard  pulses  resembling  those  of  bats,  the  level  being  varied  by  an  attenuator. 
An  ultrasonic  speaker  is  used  to  transmit  the  pulses  to  (a)  moth  preparation  and  (b) 
ultrasonic  microphone  at  the  same  distance.  Timing  marker  is  indicated  on  the  third 
channel  (see  Reference  29). 


Figure  49  Typical  Tympanic  Responses  for  Two  Stimulus  Levels.  The  top  trace 
shows  how  a  typical  pulse  envelope  might  appear  on  the  oscilloscope  at  15  db  and  at 
30  db  above  threshold  (9).  Representative  changes  seen  with  such  a  stimulus  in¬ 
crease  are  (1)  decreased  latency,  (2)  increased  initial  firing  rate  in  each  spike  group, 
(3)  increased  duration  of  group,  (4)  increased  spike  number  in  group,  and  (5)  be¬ 
ginning  of  response  by  less  sensitive  fiber  (A2).  Since  isolated  spikes  probably  do  not, 
on  the  average,  produce  significant  excitation,  spike  number  is  more  meaningfully 
and  conveniently  represented  as  one  less  than  the  total  number  of  spikes  in  the  group. 


[Figure 50 scale: sound pressure level in db, with equivalent pressure in dynes/cm²: 14 db = 0.001; 34 db = 0.01; 54 db = 0.1; 74 db = 1.0. Intermediate marks at 24, 44, and 64 db.]


Figure  50  Changes  in  Response  by  One  Fiber  to  Increasing  Signal  Level.  The 
shifts  illustrated  here  are  perhaps  slightly  exaggerated  to  emphasize  the  trends.  Near 
the  threshold  the  chief  changes  are  decreased  latency  and  increased  firing  rate. 
Though  group  number  tends  to  increase  with  increased  firing  rate,  some  records 
show  a  decrease  in  duration  of  group,  which  holds  down  the  total  number.  Further 
increase  in  signal  level  (from  perhaps  20  db  to  35  db  above  threshold)  often  pro¬ 
duces  large  increases  in  duration  and  number,  with  saturation  occurring  at  about 
40  db  above  threshold.  The  effective  response  zone  of  a  given  fiber  thus  tends  to 
run  over  less  than  40  db  of  signal  level  increase. 
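
The two scales on the level axis of Figure 50 (db and dynes/cm²) are related by a fixed reference pressure. The sketch below assumes the conventional reference of 0.0002 dyne/cm²; that value is not stated in the caption but is consistent with the axis labels, where 74 db corresponds to about 1 dyne/cm². Names are illustrative.

```python
import math

P_REF = 0.0002   # dyne/cm^2; assumed reference pressure


def db_to_dynes(level_db):
    """Sound pressure (dyne/cm^2) for a level in db re P_REF."""
    return P_REF * 10 ** (level_db / 20.0)


def dynes_to_db(pressure):
    """Level in db re P_REF for a sound pressure in dyne/cm^2."""
    return 20.0 * math.log10(pressure / P_REF)


# Axis marks at 14, 24, ..., 74 db.
axis = {db: db_to_dynes(db) for db in range(14, 75, 10)}
```

Each 20 db step multiplies the pressure by ten, so a fiber whose effective zone spans less than 40 db covers less than a hundredfold range of sound pressure.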



Figure  51  Total  Tympanic  Response  as  a  Function  of  Distance  to  Standard  Source. 
Since  the  level  of  a  uniformly  approaching  sound  increases  only  slowly  when  the 
source  is  at  some  distance  (see  sound  pressure  curve),  the  threshold  response 
changes,  which  seem  rapid  in  the  representation  of  Figure  50,  now  appear  much 
more gradual. The response trends of the more sensitive fibers (A1) are indicated
in solid lines (left, above reference axes; right, below), while the trends of the less
sensitive fibers (A2) are indicated with broken lines (some connecting lines being
omitted for clarity). For the values used in the present illustration, it is clear that
the most striking shifts occur as the source approaches from a distance of about 40
feet to a distance of about 20 feet. Note, however, that the moth’s motor response(s)
might  not  occur  for  a  half-second  or  more  after  this,  during  which  time  a  red  bat 
may  approach  15  feet  closer. 


Figure  52  Bat-Moth  Encounter:  General  Quantitative  View. 


Z-axis:  The  approach  path  of  the  bat  is  indicated  by  the  heavy  line,  which  also 
represents  the  Z-axis.  Flight  speed  is  assumed  constant  at  24  feet  per  second, 
seconds from interception being indicated by vertical (Y-axis) lines. A representation
of the bat itself, approximately to scale, is given close to each time line.
Pulse  sequences  of  the  sort  obtained  in  various  recordings  of  red  bats  in  the 
wild  are  shown  along  the  upper  side  of  the  flight  line,  the  pulses  appearing  as  small 
spikes. These pulses also appear, in expanded representation, off to the right along
the X-axis direction.

Y-axis: The Y-axis specifies the sound pressure level reaching the moth as a
function  of  the  distance  of  the  bat.  The  line  labelled  Uncor.  Sound  Level  follows  the 
theoretical  inverse  square  function  with  no  allowance  for  atmospheric  attenuation. 
The three lines below the uncorrected line indicate, respectively, the approximate
attenuation functions for rates of ¼ db/ft (25-30 kc); ½ db/ft (35-40 kc); and
1 db/ft (85-95 kc). The exact rates of attenuation vary with conditions (e.g.,
temperature and humidity). For a given signal intensity, bats that distribute most of the
energy  in  the  high  frequency  range  (e.g.,  Lasiurus )  might  thus  not  be  detected  until 
they  were  within  50  feet,  while  bats  using  lower  frequencies  (e.g.,  Eptesicus )  might 
be  detected  at  over  100  feet. 
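
The effect of atmospheric attenuation on detection distance can be sketched as inverse-square spreading plus a fixed loss per foot. Everything numerical here other than the 1 db/ft rate is an assumption for illustration: the 80 db margin between the level at 1 foot and the moth's threshold is hypothetical, 0.5 db/ft is used as a lower illustrative rate, and the function names are not from the original.

```python
import math


def level_drop_db(distance_ft, atten_db_per_ft):
    """Loss relative to the level at 1 ft: spreading plus air absorption."""
    spreading = 20.0 * math.log10(distance_ft)   # inverse-square law
    absorption = atten_db_per_ft * distance_ft
    return spreading + absorption


def detection_range_ft(margin_db, atten_db_per_ft, max_ft=100000.0):
    """Greatest distance at which the loss still fits within margin_db."""
    lo, hi = 1.0, max_ft
    if level_drop_db(hi, atten_db_per_ft) <= margin_db:
        return hi
    for _ in range(60):                          # bisection on a rising curve
        mid = 0.5 * (lo + hi)
        if level_drop_db(mid, atten_db_per_ft) <= margin_db:
            lo = mid
        else:
            hi = mid
    return lo


MARGIN_DB = 80.0   # hypothetical: level at 1 ft minus moth threshold
ranges = {rate: detection_range_ft(MARGIN_DB, rate) for rate in (0.0, 0.5, 1.0)}
```

With this hypothetical margin, the uncorrected curve would permit detection at thousands of feet, whereas an attenuation of 1 db/ft brings the range to under 50 feet, in line with the contrast between high-frequency and low-frequency bats drawn above.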

X-axis: Essentially, the X-dimension is used for expansion of sections of the
Z-axis to a scale that permits more detailed observation of the time relations in the
interpulse  intervals.  The  set  of  heavy  lines  extending  out  from  the  Z-axis  represent 
the  pulses  being  emitted  by  the  bat.  They  are  shown  varying  in  duration  from  7 
to  11  msec,  and  in  spacing  (except  after  detection  of  the  target)  from  about  70  to  200 
msec.  The  first  line  to  the  right  of  the  bat’s  flight  line  is  labelled  Pulse  to  Target 
and  gives  the  time  required  for  a  given  pulse  to  reach  the  moth  (the  pulse  being 
drawn  again  to  indicate  relative  spacings).  The  next  line,  labelled  Echo  Return, 
designates  the  time  required  for  the  echo  to  return  to  the  bat.  The  last  line,  labelled 
Interval to Next Pulse, indicates along the X-axis scale the time to the next emitted
pulse  (which  is  also  drawn  in).  Though  stereo  sound  films  of  red  bat  catches  in  the 
wild  have  not  been  made,  existing  films  suggest  that  the  pulse  spacing  is  often  such 
as  to  keep  the  primary  echo  somewhere  near  the  middle  of  the  interpulse  interval. 
However,  there  is  a  suggestion  in  some  of  the  films  that  near  the  beginning  of  the 
buzz  sequence,  secondary  overlap  (overlap  of  echo  from  previous  pulse  with  present 
pulse)  may  sometimes  occur. 
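
The time relations laid out along the expanded axis can be approximated with a few lines of arithmetic. This sketch assumes a speed of sound of about 1130 feet per second and ignores the small movement of bat and moth during the echo's flight; the 24 feet per second flight speed is from the caption, and the function names are hypothetical.

```python
SPEED_OF_SOUND = 1130.0   # ft/s, assumed (air near room temperature)
BAT_SPEED = 24.0          # ft/s, from the caption


def echo_delay_ms(distance_ft):
    """Round-trip time for a pulse to reach the target and return."""
    return 2.0 * distance_ft / SPEED_OF_SOUND * 1000.0


def interval_for_midpoint_echo_ms(distance_ft):
    """Pulse spacing that would place the echo midway between pulses."""
    return 2.0 * echo_delay_ms(distance_ft)


def secondary_overlap(distance_ft, interval_ms, pulse_ms):
    """True if the echo of the previous pulse overlaps the present pulse,
    i.e. the echo delay differs from the spacing by less than one pulse."""
    return abs(echo_delay_ms(distance_ft) - interval_ms) < pulse_ms


distance = BAT_SPEED * 1.0                      # one second from interception
delay = echo_delay_ms(distance)                 # roughly 42 msec
spacing = interval_for_midpoint_echo_ms(distance)
```

At one second from interception the midpoint rule gives a spacing of roughly 85 msec, which lies within the 70 to 200 msec range of cruising intervals quoted above.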

Target  path:  The  segment  extending  to  the  left  of  the  main  Z-axis  represents 
the  path  of  the  target.  This  is  assumed  to  be  a  slow-flying  moth  traversing  at  right 
angles  to  the  path  of  the  bat.  (Time  corrections  due  to  the  moth’s  change  of  position 
are  assumed  negligible.) 




Figure  53  Bat-Moth  Encounter:  Catch  of  Catapulted  Target  by  Wild  Red  Bat — 
Pulse  Repetition  Pattern.  The  heavy  line  shows  the  intervals  between  pulses,  as 
noted previously for laboratory catches in Figures 35 and 36. Unfortunately, controlled
distance measurements were not made with this sound-on-film record, and
consequently  the  exact  distance  of  the  bat  from  the  target  could  not  be  determined. 
The  fine  broken  line  indicates  estimated  echo  time  for  the  apparent  path  of  ap¬ 
proach.  In  this  record  the  pulse  interval  pattern  appears  to  be  divided  into  several 
rather  distinct  portions:  (1)  a  rapid  increase  in  rate,  starting  about  one  second  or 
15  to  20  feet  away  from  the  eventual  catch;  (2)  a  plateau  of  about  250  millisecond 
duration  which  occurs  about  a  half-second  from  the  catch;  (3)  another  sudden  increase 
in  rate  just  before  one-third  second  from  the  catch;  (4)  a  tenth-second  transition 
zone  into  the  final  buzz;  (5)  the  terminal  buzz;  and  (6)  a  silent  interval  of  about 
a  tenth-second  just  before  the  catch.  (After  the  catch  there  is  the  usual  long  pause 
while  the  target  is  being  grasped.)  See  next  figure  for  details. 



Figure  54  Bat-Moth  Encounter:  Buzz  Details  of  Catch  in  Figure  53.  There  is  a 
strong  suggestion  in  this  and  other  records  made  in  the  wild  that  the  pulse  spacing 
near  the  transition  zone  into  the  buzz  approximates  fairly  closely  the  echo  time  to 
the  target.  In  contrast  to  the  constant  buzz  spacing  typical  of  Myotis  lucifugus,  this 
record  does  indeed  show  a  progressive  increase  in  rate,  but  with  a  sudden  small 
jump  in  rate  just  as  the  uniformly  increasing  terminal  portion  starts.  Records  like 
these  raise  the  question  of  whether  the  bat  deliberately  adjusts  the  spacing  to  ap¬ 
proximate  some  desired  relation  between  a  given  echo  and  the  next  emitted  pulse 
while  making  use,  possibly,  of  partial  secondary  overlap.  Note  that  since  the  pulse 
rate  is  increasing,  the  sequence  of  echoes  would  tend  to  arrive  at  a  slightly  slower 
rate  than  the  emitted  sequence  (since  they  are  displaced  one  interval  back).  It  is 
conceivable  that  out  of  such  an  arrangement  the  bat  could  make  some  type  of 
rapid  vernier  measurement  of  range,  particularly  useful  in  the  pursuit  of  fast-moving 
targets  such  as  diving  moths. 


REFERENCES 


1. Batteau, D. W. The Mechanism of Human Localization of Sounds with
    Application in Remote Environments. Naval Ordnance Test Station,
    Contract No. 125-(60530)-27872A, 1962.

2. Blest, A. D., T. S. Collett, and J. D. Pye, “The Generation of Ultrasonic
    Signals by a New World Arctiid Moth,” Proc. Roy. Soc. (in press).

3. Galambos, R., “The Avoidance of Obstacles by Flying Bats,” ISIS, Vol. 4,
    No. 11 (November 1959).

4. Galambos, R., and D. R. Griffin, “Obstacle Avoidance by Flying Bats: The
    Cries of Bats,” J. Exp. Zool., Vol. 89 (1942), pp. 475-490.

5. Gould, E., “The Feeding Efficiency of Insectivorous Bats,” J. Mammal.,
    Vol. 36 (1955), pp. 399-407.

6. Gould, E., “Further Studies on the Feeding Efficiency of Bats,”
    J. Mammal., Vol. 40 (1959), pp. 149-150.

7. Griffin, D. R., “Bat Sounds Under Natural Conditions with Evidence for the
    Echolocation of Flying Prey,” J. Exp. Zool., Vol. 123 (1958), pp. 435-466.

8. Griffin, D. R. Listening in the Dark. New Haven, Connecticut: Yale
    University Press, 1958.

9. Griffin, D. R., D. Dunning, D. A. Cahlander, and F. A. Webster,
    “Correlated Orientation Sounds and Ear Movements of Horseshoe Bats,”
    Nature (in press).

10. Griffin, D. R., and R. Galambos, “The Sensory Basis of Obstacle Avoidance
    in Flying Bats,” J. Exp. Zool., Vol. 86 (1941), pp. 481-506.

11. Griffin, D. R., and A. D. Grinnell, “Ability of Bats to Discriminate
    Echoes from Louder Noise,” Science, Vol. 128 (1958), pp. 145-147.

12. Griffin, D. R., J. J. G. McCue, and A. D. Grinnell, “The Resistance of
    Bats to Jamming,” (paper in preparation).

13. Griffin, D. R., and A. Novick, “Acoustic Orientation in Neotropical
    Bats,” J. Exp. Zool., Vol. 130 (1955), pp. 251-300.

14. Griffin, D. R., F. A. Webster, and C. R. Michael, “The Echolocation of
    Flying Insects by Bats,” Anim. Behav., Vol. 8 (1960), pp. 141-154.

15. Grinnell, A. D. Neurophysiological Correlates of Echolocation in Bats.
    ONR Technical Report No. 30 (NR-301-219), June, 1962.

16. Grinnell, A. D., and D. R. Griffin, “The Sensitivity of Echolocation,”
    Biol. Bull., Vol. 114 (1958), pp. 10-22.

17. Kay, L., “A Plausible Explanation of the Bat’s Echolocation Acuity,”
    Anim. Behav., Vol. 10 (1962), pp. 34-41.

18. McCue, J. J. G., “How Bats Hunt with Sound,” Nat. Geographic, Vol. 119
    (1961), pp. 570-578.

19. Mills, A. W., “Auditory Perception of Spatial Relations,” in L. L. Clark
    (ed.) Proceedings of the International Congress on Technology and
    Blindness, Vol. II. New York: American Foundation for the Blind, 1963.

20. Novick, A., “Orientation in Paleotropical Bats,” J. Exp. Zool., Vol. 138
    (1958), pp. 81-154.

21. Payne, R. Acoustic Orientation of Prey by the Barn Owl, Tyto Alba. ONR
    Technical Report No. 1 (NR-301-549), November, 1961.

22. Pye, J. D., “A Theory of Echolocation by Bats,” J. Laryng. Otol.,
    Vol. 74 (1960), pp. 718-729.

23. Pye, J. D., “Echolocation by Bats,” Endeavour, Vol. 20 (1961),
    pp. 101-111.

24. Pye, J. D., “Perception of Distance in Animal Echolocation,” Nature,
    Vol. 190 (1961), pp. 362-363.

25. Pye, J. D., M. Flinn, and A. Pye, “Correlated Orientation Sounds and Ear
    Movements of Horseshoe Bats (2),” Nature (in press).

26. “Vespertillio,” in A. Rees (ed.) Rees’s Cyclopaedia, Vol. 37. London:
    Longman, Hurst, Rees, Orme, and Brown, 1819, p. Q-2.

27. Roeder, K. D., “Ultrasonic Interaction of Bats and Moths,” in
    E. E. Barnard and M. R. Kare (eds.) Biological Prototypes and Synthetic
    Systems. New York: Plenum Press, 1962.

28. Roeder, K. D. Nerve Cells and Insect Behavior. Cambridge: Harvard
    University Press, 1963.

29. Roeder, K. D., and A. E. Treat, “Ultrasonic Reception by the Tympanic
    Organ of Noctuid Moths,” J. Exp. Zool., Vol. 134 (1957), pp. 127-158.

30. Roeder, K. D., and A. E. Treat, “The Detection and Evasion of Bats by
    Moths,” Amer. Scientist, Vol. 49 (1961), pp. 135-148.

31. Roeder, K. D., and A. E. Treat, “The Detection of Bat Cries by Moths,”
    in W. Rosenblith (ed.) Sensory Communication. Cambridge: MIT Technology
    Press, 1961.

32. Roeder, K. D., and A. E. Treat, “The Acoustic Detection of Bats by
    Moths,” Proceedings of the 11th Entomological Congress (in press).

33. Schneider, H., and F. P. Möhres, “Die Ohrbewegungen der
    Hufeisenfledermäuse (Chiroptera, Rhinolophidae) und der Mechanismus des
    Bildhörens,” Z. f. Vergl. Physiol., Vol. 44 (1960), pp. 1-40.

34. Treat, A. E., “Response to Sound in Certain Lepidoptera,” Ann. Entomol.
    Soc. Amer., Vol. 48 (1955), pp. 272-284.

35. Treat, A. E., “The Reaction Time of Noctuid Moths to Ultrasonic
    Stimulation,” J. N. Y. Entomol. Soc., Vol. 64 (1956), pp. 165-171.

36. Treat, A. E., “The Metathoracic Musculature of Crymodes Devastator
    (Brace) (Noctuidae) with Special Reference to the Tympanic Organ,”
    Smithsonian Misc. Col., Vol. 137 (1959), pp. 365-377 and 16 plates.

37. Treat, A. E., and K. D. Roeder, “Electrical Response of the Noctuid
    Tympanum to Ultrasonic Stimulation,” Proceedings of 10th International
    Congress on Entomology, Vol. 2 (1958), pp. 117-120.

38. Webster, F. A., “Mobility Without Vision by Living Creatures Other Than
    Man (with Special Reference to the Insectivorous Bats),” in J. W. Linsner
    (ed.) Proceedings of the Mobility Research Conference. New York: American
    Foundation for the Blind, 1962, pp. 110-127.

39. Webster, F. A., “Bat-Type Signals and Some Implications,” in Bennett,
    Degan, and Spiegel (eds.) Human Factors in Technology. New York:
    McGraw-Hill, 1963.

40. Webster, F. A., and D. R. Griffin, “The Role of the Flight Membranes in
    Insect Capture by Bats,” Anim. Behav., Vol. 10 (1962), pp. 332-340.


ACTIVE  ENERGY  RADIATING  SYSTEMS: 


ULTRASONIC  GUIDANCE  FOR  THE  BLIND* 


LESLIE  KAY 

University  of  Birmingham,  Birmingham,  England 


INTRODUCTION 

There have been several attempts to produce a guidance aid for the blind
based on the use of sound or electromagnetic waves and covering a very
wide range of frequencies. The sound waves ranged from audible frequencies
to about 100 kc/sec, and the electromagnetic waves have covered the radio,
visible light, and ultraviolet bands of the spectrum; but all systems
have been unsuccessful in the sense that they have not come into general
use. The results did, however, lead to two conclusions: that devices
employing light would have most chance of success, and that the sensory
stimulus should be tactile. A brief survey of these devices by Busher (2)
reflects the current thought on the subject, which apparently excludes
almost completely the possibility of ultrasonic devices producing the
required results in the future.

Very little work on the use of ultrasonics for blind guidance appears
to have been reported since around 1950 (1). The work described here
was started at the University of Birmingham in 1960, because recent
progress in ultrasonic techniques and the studies of the bat’s behavior by
Griffin and others (3) during the past decade led to the belief that a
successful ultrasonic aid may now be possible.

The behavior of bats under adverse conditions has shown that these
particular animals can make good use of ultrasonic waves, and recordings
show that much of the information needed and obviously obtained by the
bat must be obtained as a consequence of the broad frequency band used
in the ultrasonic emission. While it is impossible at this stage to say just
how the bat actually uses the information, there can be no doubt that
adequate information is available in the signal it receives. Two theories
have been suggested to explain why bats use a wide frequency band in
their ultrasonic emission, which takes the form of a linear (or nearly
linear) sweep of frequency of up to one octave in the region of 50 kc/sec.

* The author is pleased to acknowledge the financial assistance given by St.
Dunstan’s and the encouragement received from St. Dunstan’s Scientific Committee
and the National Research Development Corporation in the conduct of this work.

Strother (9) suggests the system is similar to that used in “chirp radar.”
This is basically a pulse system using a frequency-swept transmission and
a time delay in the receiving channel which is a function of the signal
frequency being received. High frequencies are delayed longer than the low
frequencies so that a change from a high to a low frequency during the
pulse transmission will result in an echo signal in the receiver output being
compressed in time, and the amplitude relative to the noise background
thereby increased. Two factors make this method unlikely: the resolution
enjoyed by the bat shows that echoes from objects are received before
the transmission has ceased, and there is no satisfactory evidence
of a time delay which is a function of frequency in the auditory neural
system.
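The time compression described above can be illustrated with a minimal sketch. It assumes an idealized linear downward sweep and a receiver delay that rises linearly with frequency; the sweep limits, pulse duration, and echo delay are illustrative values chosen here, not figures taken from Strother or from the bat.

```python
# Sketch of the "chirp" idea: a downward sweep plus a receiver delay that
# grows with frequency makes every part of the echo emerge at the same time.
# All numerical values below are illustrative assumptions.

F_HIGH = 60_000.0   # start of sweep, cps (assumed)
F_LOW = 30_000.0    # end of sweep, cps (assumed)
PULSE = 0.002       # pulse duration, sec (assumed)
RATE = (F_HIGH - F_LOW) / PULSE   # sweep rate, cps per second


def emission_time(freq):
    """Time after the start of the pulse at which the sweep passes freq."""
    return (F_HIGH - freq) / RATE


def receiver_delay(freq):
    """Frequency-dependent delay: high frequencies are held back longer."""
    return (freq - F_LOW) / RATE


def output_time(freq, echo_delay):
    """Time at which the echo component at freq leaves the receiver."""
    return emission_time(freq) + echo_delay + receiver_delay(freq)


echo_delay = 0.004  # round-trip time to the object, sec (assumed)
times = [output_time(f, echo_delay) for f in (60_000.0, 45_000.0, 30_000.0)]
print(times)  # every component emerges PULSE + echo_delay after pulse start
```

Because the emission time and the receiver delay always sum to the pulse duration, the whole echo is delivered at one instant, which is the compression in time referred to in the text.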

The alternative system proposed by the author (5) and, at the same
time and quite independently, by Pye (7) has more attractive possibilities.
A beat note produced during the transmission between the transmitted
frequency and that of an echo will have a pitch which is proportional to
the distance of the object, and it is physiologically possible for such a beat
note to be produced. Echoes received during the transmission can therefore
be resolved in a frequency scale instead of a time scale. It will certainly
be a very long time before either theory can be proved, or disproved,
as is evident from our inability to explain even the performance of our own
auditory system.

It is nevertheless attractive to put forward the hypothesis that since a
bat can “see” with ultrasonic waves, a blind person should also be able
to do so, and therefore a copy of the bat’s system of navigation may have
a very good chance of success. This argument is not necessarily sound.
Humans are of a very much more intelligent order and require more from
life than a bat if they are to enjoy a satisfying existence. The bat probably
concentrates on a single task while a blind person undoubtedly has many
thoughts passing through his mind while mobile, and any really successful
guidance aid should not prevent these. Many more such reasons can
readily be found. The information obtained by a bat through the medium
of sound waves may not in fact be adequate for the satisfactory guidance
of blind humans, and it is very clear that psychological factors are of
paramount importance.

On the other hand, the dependence on their natural senses for mobility
causes considerable stress to blind people, and the relief of this may be
sufficient justification for the general use of a device. If the information
gathered can be analyzed subconsciously by the neural system, more than
this relief of stress may ultimately be achieved. Audible sound waves
already provide a medium whereby blind people can gather information
about their surroundings without any conscious knowledge of the
mechanism, and if this can be greatly improved by using ultrasound, this
guidance medium should be fully exploited.

THE  FREQUENCY  MODULATION  SYSTEM 

AS  POSSIBLY  USED  BY  BATS 

The ultrasonic emission from many species of bat resembles the
transmission used in a pulsed frequency modulated (FM) echolocation
system. In Figure 1 it can be seen that the returning echoes due to a pulsed
FM emission will also be pulses of varying frequency, and a difference
frequency, or beat note, can exist which is a function of the distance
to an object. The duration of these difference-frequency pulses is also a
function of the distance. Echoes received during the actual transmission
could be detected by the beat note produced in the ears, possibly in the
same way as the beat note produced when we hum a note slightly
different from that of a musical instrument. A bat would then detect objects
down to zero distance from a maximum of 3 to 4 feet depending upon the
duration of the pulses.

With such a system, each reflecting surface or object within a few feet
would produce its own beat note having a frequency, or pitch, directly
related to its distance, so that several objects could be resolved by the
neural system. How otherwise can a bat observe echoes which arrive
during the transmission period? A simple pulse system would not provide the
same information.
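The dependence of the beat note on distance can be sketched numerically. The example below assumes a perfectly linear sweep, a speed of sound of about 1100 ft/sec, and a sweep of 25 kc/sec over a 6 msec pulse; these values are assumptions chosen for illustration and are not measurements of any bat.

```python
# Sketch: beat note between a linear FM pulse and its own echo.
# SPEED_OF_SOUND, the sweep width and the pulse duration are assumed values.

SPEED_OF_SOUND = 1100.0  # ft/sec, approximate value in air (assumed)


def echo_delay(distance_ft):
    """Round-trip travel time to an object, in seconds."""
    return 2.0 * distance_ft / SPEED_OF_SOUND


def beat_note(distance_ft, sweep_cps, pulse_sec):
    """Return (beat frequency in cps, beat duration in sec).

    Returns None when the echo arrives after the emission has ended,
    since no beat with the outgoing sweep is then possible.
    """
    delay = echo_delay(distance_ft)
    if delay >= pulse_sec:
        return None
    rate = sweep_cps / pulse_sec          # sweep rate, cps per second
    return rate * delay, pulse_sec - delay


def max_beat_range(pulse_sec):
    """Greatest distance at which echo and emission still overlap."""
    return SPEED_OF_SOUND * pulse_sec / 2.0


print(max_beat_range(0.006))             # about 3.3 ft for a 6 msec pulse
print(beat_note(1.1, 25_000.0, 0.006))   # pitch and duration at 1.1 ft
print(beat_note(5.0, 25_000.0, 0.006))   # None: echo arrives too late
```

Under these assumptions the overlap limit falls in the region of 3 to 4 feet quoted in the text, and both the pitch and the duration of the beat note vary with distance.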

Binaural  System 

The bat may also obtain better information about the direction of an
object, using the frequency modulation method of distance measurement, as
compared with receiving only a simple pulse. Since the frequency is varying
during the transmission, phase comparison between the signals received
by the two ears is not possible; without beat notes, therefore, time of
arrival would be the only source of directional information. If on the other
hand the beat note is used, an object to the left or right would produce beat
notes in the two ears which have different frequencies. The difference is
nearly proportional to the angle over quite a wide arc and is independent
of distance (5).

Figure 1 Information Available from Pulsed Frequency Modulation Echo
Location. The bat may have two mechanisms at its disposal: (a) the path
difference due to separation of the bat’s ears giving a frequency difference
Δf and consequently angular information with respect to the line of flight;
(b) the distance to the target giving frequency difference Δf between
emission and echo.
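The claim that the binaural frequency difference depends on angle but hardly on distance can be checked with a short geometric sketch. The ear separation, sweep rate, and speed of sound used here are assumed values for illustration only; the emitter is placed midway between the ears, which is also an assumption.

```python
import math

# Sketch: difference between the beat notes at the two ears for a linear
# FM sweep. All constants are assumed, illustrative values.

SPEED_OF_SOUND = 1100.0   # ft/sec (assumed)
EAR_SEPARATION = 0.05     # ft between the ears (assumed)
SWEEP_RATE = 4.0e6        # cps per second (assumed)


def binaural_beat_difference(distance_ft, angle_deg):
    """Beat-frequency difference (cps) between the far and the near ear.

    The object lies at distance_ft from the midpoint between the ears,
    angle_deg away from the line of flight. The outgoing path is common
    to both ears, so only the return paths differ.
    """
    theta = math.radians(angle_deg)
    x = distance_ft * math.sin(theta)
    y = distance_ft * math.cos(theta)
    far = math.hypot(x + EAR_SEPARATION / 2.0, y)
    near = math.hypot(x - EAR_SEPARATION / 2.0, y)
    return SWEEP_RATE * (far - near) / SPEED_OF_SOUND


for d in (2.0, 4.0):
    print(d, binaural_beat_difference(d, 30.0))   # nearly equal values
for a in (10.0, 20.0):
    print(a, binaural_beat_difference(3.0, a))    # roughly doubles
```

The path difference is close to the ear separation multiplied by the sine of the angle, so the frequency difference is nearly proportional to the angle for moderate angles and almost unchanged when the distance is doubled.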

A  system  such  as  that  described  would  be  expected  to  give  a  better 
performance  against  a  noise  background,  and  does  explain  many  of  the 
puzzling  features  of  the  bat’s  echo  location  acuity.  Without  the  beat  note 
effect  the  energy  in  the  echo  pulse  covers  the  same  frequency  band  as  noise 
which  has  been  used  in  experiments  to  upset  the  bats  when  flying  between 
fine  wires  (3).  It  would  be  quite  impossible  for  the  bat  to  detect  the  echoes 
from  such  wires  in  the  presence  of  a  high  noise  level  without  some  such 
form  of  signal  processing  in  its  receiving  channel.  The  beat  notes  have  a 
limited  frequency  band  depending  upon  the  duration  of  the  emission,  and 
the  filtering  effect  of  the  ear  can  then  improve  the  effective  signal/noise 
ratio. 

When we hear a sound it is usually related to some spatial position,
but we do not consciously realize that the sound is processed inside our
head. It is only when sounds are fed to our ears artificially, via headphones
for example, that they actually seem to be inside the head. The same
experience can be expected with the bat during echolocation; although the
bat generates the sound, when the echoes are received they must surely
appear to come from the object and a spatial sound picture will be formed.

This  hypothesis  is  based  upon  the  form  of  transmission  made  by  bats 
and  their  general  behavior,  and  leads  to  a  system  which  could  be  used  by 
blind  humans,  if  correctly  designed  to  meet  the  responses  of  the  human  ear. 

ULTRASONIC  GUIDANCE  AIDS 

Laboratory  Equipment 

Both pulse and frequency modulation systems tried in the past (1, 5)
proved ineffective, and some evidence that a new system could be much
more effective is required. Equipment was developed in the early stages
of the research program to reproduce some of the effects obtained with
previous guidance aids using both pulses of tone and frequency-modulated
transmission. Schematic diagrams of the equipment are shown in Figures
2 and 3. Wide band ultrasonic transducers of the type developed by Kuhl
et al. (6) were used to cover the frequency range required. They were
not available in the past and it is partly due to these that encouraging
results have now been obtained.

Figure 2 Schematic Diagram of Pulse System

Figure 3 Schematic Diagram of Frequency Modulation System

The  Pulse  System 

The frequency of the transmitter oscillator in the pulse equipment
could be varied over a range of frequencies from 20 kc/sec to 100
kc/sec and the drive to the output stage was gated to produce pulses
of tone of varying duration from 1 msec to 20 msec. The repetition rate
was adjusted to suit the maximum distance from which it was required
to receive an echo. When, for example, a distance of 20 feet is required,
the time taken to receive an echo from this distance is approximately 40
msec. The period between the pulses must then be adjusted to 80 msec to
avoid ambiguity in range. When ambiguity arises it is because the ear is
unable to tell whether the echo or the transmission is received first, and the
time interval between transmission and echo cannot therefore exceed half
the transmission period. The received echo was heterodyned with a second
oscillator having a frequency which was between one and four kilocycles
lower than the frequency of transmission. An audible note could then be
heard in the headphones when an echo was received. An arrangement was
provided for coupling the transmitted pulse to the receiver at an attenuated
level so that both the transmission and the echo could be heard. The audible
output to the headphones was in the form of either a low frequency pulse
of between one and four kilocycles or a dc pulse obtained from the rectified
audio frequency pulse, as shown in Figure 4.
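The timing rule in the paragraph above can be written out as a small calculation. The sketch assumes a speed of sound of 1000 ft/sec, the round figure implied by the statement that an echo from 20 feet takes approximately 40 msec (the true value in air is nearer 1100 ft/sec); the function names are chosen here for illustration.

```python
# Sketch: echo delay and minimum pulse period for the pulse system.
# SPEED_OF_SOUND is the round figure implied by the text, not a measurement.

SPEED_OF_SOUND = 1000.0  # ft/sec (assumed round figure)


def round_trip_msec(range_ft):
    """Time for a pulse to reach an object and return, in msec."""
    return 2.0 * range_ft * 1000.0 / SPEED_OF_SOUND


def min_pulse_period_msec(max_range_ft):
    """Shortest repetition period that avoids ambiguity in range.

    The echo delay must not exceed half the period, otherwise the ear
    cannot tell whether the echo or the transmission came first.
    """
    return 2.0 * round_trip_msec(max_range_ft)


print(round_trip_msec(20.0))         # echo delay for 20 feet, in msec
print(min_pulse_period_msec(20.0))   # required pulse period, in msec
```

For a maximum range of 20 feet this reproduces the 40 msec delay and the 80 msec period quoted in the text.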


Figure 4 Pulses Applied to Head Phones. (a) Audible frequency pulses
applied to headphones; (b) DC pulses applied to headphones. [Diagram
labels: controlled speak-through; transmission via path distance d; pulse
period Tp. Note: t must not exceed Tp/2.]


The  Frequency  Modulation  System 

The parameters of the system can best be understood from the graphs
of Figure 5; it will be observed that the transmission is continuous and
that the sweep lasts for five times the period between the transmission
pulses in the pulse system. The frequency sweep was from 60 kc/sec
to 30 kc/sec and the maximum audible frequency was 3 kc/sec. This
maximum frequency was chosen because the response of the ear falls
off beyond 3 kc/sec; between 100 cps and 3 kc/sec it rises by
approximately 40 db and is very effective in correcting for the attenuation
in the air as the range increases. A wide frequency band of 30 kc/sec in
the medium has two advantages. First, the time required to sweep this band
is 400 msec for a maximum range of 20 feet and an echo therefore becomes
an almost constant tone; quality then becomes apparent. For example,
an echo from 20 feet will be heard from 40 msec to 400 msec into each
sweep. A short break of 40 msec maximum has little effect on the quality
of the echo pitch. Second, the frequency band in the medium should be as
large as possible if information about the reflecting surface is to be
obtained. For instance, changing the wavelength by a factor of 2 to 1
produces a modulation on the echo waveform at the audio output when the
reflecting surface has two or more reflecting points separated in range by
about one wavelength. The pattern changes throughout the sweep period.

Figure 5 Parameters of FM System. [Graph of frequency against time.]
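The relation between range and audible pitch in the continuous FM system follows directly from the figures quoted above, and a minimal sketch is given here. It assumes an ideal linear sweep and a speed of sound of 1000 ft/sec, the round figure implied by the 40 msec delay quoted for 20 feet; the function names are hypothetical.

```python
# Sketch: range-to-pitch mapping for the continuous FM system.
# Sweep figures are from the text; SPEED_OF_SOUND is an assumed round figure.

SPEED_OF_SOUND = 1000.0   # ft/sec (assumed)
SWEEP_BAND = 30_000.0     # cps, sweep from 60 kc/sec to 30 kc/sec
SWEEP_TIME = 0.400        # sec, duration of one sweep


def beat_frequency(range_ft):
    """Audible difference frequency (cps) for an echo from range_ft."""
    delay = 2.0 * range_ft / SPEED_OF_SOUND
    return SWEEP_BAND / SWEEP_TIME * delay


def range_from_pitch(beat_cps):
    """Inverse mapping: range (ft) that produces a given beat frequency."""
    delay = beat_cps * SWEEP_TIME / SWEEP_BAND
    return delay * SPEED_OF_SOUND / 2.0


def tone_and_break_msec(range_ft):
    """Duration of the echo tone and of the break at flyback, in msec."""
    delay_msec = 2.0 * range_ft * 1000.0 / SPEED_OF_SOUND
    return SWEEP_TIME * 1000.0 - delay_msec, delay_msec


print(beat_frequency(20.0))        # close to 3000 cps at maximum range
print(range_from_pitch(1500.0))    # close to 10 ft
print(tone_and_break_msec(20.0))   # about 360 msec of tone, 40 msec break
```

With these parameters the pitch rises by roughly 150 cps per foot of range, and the maximum range of 20 feet corresponds to the chosen upper limit of 3 kc/sec.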

To reproduce the FM arrangements used in earlier guidance aids, a gate
was included in the transmitter circuit to produce pulses of
frequency-modulated signal, and the repetition rate was adjusted accordingly.
One arrangement was to allow the transmitter to sweep a band of 3 kc/sec
over a period of 40 msec, followed by an “off” period of 40 msec to
eliminate ambiguity in range.




By  adjusting  the  frequency  sweep  rate,  shorter  maximum  ranges  than 
20  feet  were  also  obtainable.  The  highest  sweep  rate  gave  a  maximum 
range  of  5  feet. 

TRANSDUCERS 

Dielectric microphones of the type described by Kuhl were used, giving a
uniform over-all response up to 70 kc/sec. The beam of the
transmit/receive arrangement was about 10 degrees at 50 kc/sec but, of
course, varied with the frequency. This beam of 10 degrees was chosen
initially to obtain high sensitivity and satisfactory angular resolution. A
very narrow beam would be difficult to use and too wide a beam might have
given too little sensitivity. Later developments show that wider beams can
be used with some advantages.

LABORATORY  TESTS 

Tests were carried out with a few sighted colleagues to determine the
approximate relative performance of the systems. The subjects were not
trained in any way and could not therefore judge distance absolutely from
the sounds they heard. They were only asked to state the order of merit
in which they would place the systems on the basis of the information they
received from the signals.

Initially the transmitter was separated from the receiver and pointed
at it from various distances. Only one “echo” was then received. The
results are discussed briefly below.

Pulse  System  with  No  Direct  Coupling  to  the  Transmitter 

Pulses  could  be  clearly  heard  but  there  was  no  indication  of  the  distance 
from  which  the  signal  was  being  received.  This  applied  equally  to  audible 
frequency  pulses  and  dc  pulses  in  the  headphones.  Two  headphones  were 
used  in  all  the  tests  to  balance  the  auditory  system,  although  this  is  not 
absolutely  necessary. 

Pulse  System  with  Direct  Coupling  to  the  Transmitter 

Audible Frequency Pulses. Both the transmitter pulse and the echo
pulse were adjusted to be of the same amplitude. The subject could
tell the difference between the sounds produced by the transmission
pulse alone and the transmission and “echo” pulses together, but could
not say just how close they were. Thus he was unable to judge
distance. There was a faint sensation of hearing a note with the click,
the pitch of which varied with the distance to the transmitter. Once
the attention is drawn to this, the pitch can be judged, but only with
some difficulty. The effect is reduced as the relative amplitudes of the
two pulses are changed. Altering the repetition rate of the pulses also alters
the pitch. This can be seen from Figure 6. The line spectrum of each series
of transmission or echo pulses follows a sin x/x envelope and is made up
of discrete frequencies spaced 1/Tp on either side of f0, the audible
difference frequency. The phase between the frequency components of the
transmission pulses is, however, different from that of the echo pulses, and
when the two are added a new spectrum envelope is produced having peaks
which move away from the central peak as the echo delay t is reduced. It
is suggested that these peaks produce the same effect as formants in
auditory perception, and the subject thinks he hears the fundamental which
in fact is not present. Since the subject does not actually hear the so-called
fundamental, the effect is weak. The click due to the pulses predominates.

Figure 6 Spectrum of Audio Frequency Pulses from Which Distance
Information is to be Obtained by the Ear. [Panels: time scale of direct
transmission (speak-through) and transmission via path distance d;
spectrum of transmission and echo pulses together, amplitudes equal.
Vertical axis: relative amplitude.]

DC Pulses Applied to the Headphones. The results were similar to those
obtained with audio frequency pulses but the pitch of the note heard
together with the clicks was much more marked. This can be seen from
Figure 7. The spectrum of the pulses now extends from zero frequency and
the peaks due to the phase addition and cancellation of the frequency
components are in the low audio band. The first peak appears to produce the
note which is heard, but there is some uncertainty here because the quality
is so poor compared with the tone for determining the fundamental
frequency. The pitch could, however, be judged sufficiently well to give a
rough indication of distance. Even so, the click was still predominant.
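The phase addition and cancellation described above follows from a standard Fourier property: adding a pulse to an equal copy of itself delayed by t multiplies its spectrum by a factor whose magnitude is 2|cos(πft)|. The sketch below evaluates that factor; the delay values are illustrative assumptions, and the identification of the first peak with the pitch heard is the suggestion made in the text, not something the code demonstrates.

```python
import cmath
import math

# Sketch: envelope imposed on the spectrum when a pulse is added to an
# equal-amplitude copy of itself delayed by delay_sec. Delays are assumed.


def pair_envelope(freq_cps, delay_sec):
    """Magnitude of the factor 1 + exp(-j 2 pi f t) at frequency f."""
    return abs(1.0 + cmath.exp(-2j * math.pi * freq_cps * delay_sec))


def first_peak_cps(delay_sec):
    """Lowest non-zero frequency at which the two pulses add in phase."""
    return 1.0 / delay_sec


delay = 0.010  # echo delay of 10 msec (assumed)
print(first_peak_cps(delay))          # first peak, in cps
print(pair_envelope(100.0, delay))    # in phase: close to 2
print(pair_envelope(50.0, delay))     # cancellation: close to 0
print(first_peak_cps(0.005))          # shorter delay, higher first peak
```

The peaks lie at multiples of 1/t, so they move outward in frequency as the delay is reduced, in keeping with the behavior described for Figures 6 and 7. Because the pulses repeat with period Tp, the actual spectrum consists of discrete lines sampled from this envelope.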

The  Frequency  Sweep  System  Using 

a  Continuous  Transmission 

This system is silent when no “echo” is present, but as soon as an echo
is received the echo tone can be heard. (In the pulse systems the
transmission click has to be tolerated at all times.) The “echo” was in the
form of a clear but interrupted note; when the break in the note, due to the
flyback of the transmission sweep, was long (20 milliseconds, say) there was
a faint thump accompanying the tone which increased as the range was
reduced. The frequency of the echo tone was also reduced. Now, however,
the tone predominated. This can be seen from Figure 8. The tone is
produced by the large components (maximum of 2) of the spectrum, and the
“thump” results from the smaller side components.

The  pitch  was  easy  to  judge — as  easy  as  with  a  pure  tone.  The  pulsed 
FM  arrangement  gave  results  which  lay  between  the  pulse  system  and  the 
continuous  FM  system,  the  thump  becoming  more  predominant  as  the 
duration  of  the  sweep  was  reduced  to  the  point  where  the  silent  period  was 
the  same  as  the  transmission  period. 


Figure 7 Spectrum of DC Pulses from Which Distance Information is
Obtained by the Ear. [Panels: DC pulses applied to headphones; spectrum
of DC pulses due to transmission and “echo” of equal amplitude.]

Multiple  Echoes 

The systems were then used for detecting objects, and provided the
background noise was negligible, single objects could be detected equally
well with both pulse and FM arrangements. It is only when the signal/noise
ratio is small that any difference is noticed, and this affects both the
maximum range of the system and the smallest object which can be detected.
As was expected, the FM system was superior since continuous signals were
being used, which always give better performance than pulsed signals
because of the difference in bandwidth and mean power.

Figure 8 Spectrum of Echo From FM System. (Position of spectrum
on frequency scale gives distance information.) [Panels: audio frequency
“echo”; amplitude modulation on echo with FM system; spectrum of echo
with FM system; spectrum on expanded scale, showing spectral lines under
a sin x/x function (lines not generally coincident with peak of envelope).
Axes: relative amplitude against frequency.]

When more than one object was in the field of view the results of the
various systems were very different. Under realistic conditions two echoes
are rarely of the same amplitude since they are usually from different
surfaces, and when this occurs the subject has great difficulty in hearing
the weaker echo with the pulse system. The strong echo apparently suppresses
it. This was not the case with the FM system except when two echoes were
of almost the same frequency. When the strong echo is only slightly
modulated by the weak one, discrimination becomes difficult. Generally,
however, two notes are distinctly heard and the relative distance between
the two objects and the distance to them can be easily judged. A resolution
of better than one foot at a maximum range of ten feet was found to be
possible with a little practice. At shorter ranges the resolution improves.

It will be appreciated that the subject was being asked to interpret a
very complex spectrum when more than one echo was being received with
the pulse system, as can be imagined from the spectrum of Figure 7. With
the FM system the spectrum is very much simpler. The spectra of two
echoes do not seriously overlap until the frequencies of the echoes are
separated by less than 4/Ts. In the system used, this was only 10 cps.
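The figure of 10 cps can be checked against the sweep parameters quoted earlier, and converted into an equivalent separation in range. The sketch assumes Ts is the 400 msec sweep time of the laboratory system and uses the stated mapping of 3 kc/sec to 20 feet; it describes only the spectral overlap criterion, not what a listener can actually resolve.

```python
# Sketch: spectral overlap criterion 4/Ts and its equivalent in range.
# Assumes Ts is the 400 msec sweep time of the laboratory FM system.

SWEEP_TIME = 0.400      # sec, Ts (assumed to be the sweep duration)
MAX_BEAT_CPS = 3000.0   # audible frequency at maximum range
MAX_RANGE_FT = 20.0     # maximum range for this sweep rate


def min_separation_cps(sweep_time_sec):
    """Frequency spacing below which two echo spectra overlap seriously."""
    return 4.0 / sweep_time_sec


def separation_in_feet(separation_cps):
    """Range difference corresponding to a given difference in pitch."""
    return separation_cps * MAX_RANGE_FT / MAX_BEAT_CPS


spacing = min_separation_cps(SWEEP_TIME)
print(spacing)                      # close to 10 cps
print(separation_in_feet(spacing))  # under a tenth of a foot
```

On these figures the spectra of two echoes remain separate for range differences of less than an inch, so the resolution of better than one foot reported in the tests is set by the listener and not by overlap of the spectra.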

Several  objects  close  together,  however,  could  not  be  resolved  with  the 
FM  system — three  is  about  the  maximum — and  the  echoes  merge  into  a 
musical  sound  or  a  sound  pattern.  The  ability  to  learn  a  sound  pattern  is 
well  known  from  our  everyday  experience,  but  it  does  not  follow  that  these 
particular  patterns  can  be  learned. 

The conclusions drawn from the laboratory tests were very definite.
There was no doubt that the wide band continuous FM system was very
much superior to all the pulsed arrangements and was therefore the system
to try as a portable device. As a go/no-go guidance aid the pulse
arrangements could suffice, but more than this was found to be required,
judging from experiments in the past. Since the continuous wide band FM
system gave results which were so much better than what must have been
obtained from previous guidance aids, there was good reason for making a
portable model.


PORTABLE  FM  ULTRASONIC  GUIDANCE  AID 

A portable guidance aid has been made using a frequency band of 30
kc/sec in the air, sweeping from 60 kc/sec to 30 kc/sec, and with a
maximum audible frequency of 3 kc/sec. Two sweep rates are provided to give
10 feet or 30 feet maximum range, and the duration of the echo tone is
approximately 180 msec with a maximum break of 20 msec, or 540 msec
with a maximum break of 60 msec, respectively. Hearing aid receivers
(earpieces)  are  used  for  producing  the  audible  sounds,  and  instead  of 
placing  these  over  the  ears,  as  is  normally  done  with  hearing  aids,  a  small 
diameter  plastic  tube  is  used  to  feed  the  sound  into  the  outer  ear  (meatus). 
It  is  envisaged  that  a  plastic  mold  will  be  used  which  supports  the  tube  in 
the  meatus  but  does  not  seriously  affect  normal  hearing.  The  two  ears  are 
used  because  it  is  believed  that  auditory  balance  will  be  maintained  and 
normal  binaural  auditory  perception  unimpaired.  One  ear  can  be  used  if 
this  is  found  to  be  preferable.  The  ‘torch’  (flashlight  case)  has  an  effective 
beam  of  about  10  degrees. 
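
The figures quoted for the portable aid can be checked with a short calculation. The sketch below assumes a linear sawtooth sweep, takes the sweep period to be the echo tone duration plus the maximum break (200 msec and 600 msec), and assumes a speed of sound of about 1,125 ft/sec. None of these assumptions is stated explicitly in the text, and the function name is illustrative only.

```python
# Sketch: audible (difference) frequency of an echo in a linear FM system.
SPEED_OF_SOUND_FT_PER_S = 1125.0  # assumed value for air at about 20 deg C


def echo_tone_hz(range_ft, band_hz=30_000.0, sweep_period_s=0.200):
    """Difference frequency between the transmitted sweep and its echo."""
    delay_s = 2.0 * range_ft / SPEED_OF_SOUND_FT_PER_S  # out-and-back travel time
    sweep_rate_hz_per_s = band_hz / sweep_period_s       # assumed linear sweep
    return sweep_rate_hz_per_s * delay_s


if __name__ == "__main__":
    print(round(echo_tone_hz(10.0)))                        # short-range setting
    print(round(echo_tone_hz(30.0, sweep_period_s=0.600)))  # long-range setting
```

Under these assumptions an object at maximum range gives a tone of roughly 2.7 kc/sec on either setting, just below the stated 3 kc/sec limit, and the round-trip delay at maximum range (about 18 msec or 53 msec) is close to the stated maximum breaks of 20 and 60 msec.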

The purpose of making a portable aid with a torch was mainly to
determine the nature of the auditory signals which would be received from the
common objects which are encountered in everyday life, and to test the
unit for its response to especially dangerous obstacles such as descending
steps and step-down curbs. Manholes, etc., come into the same category. It
was also necessary to see if the type of signal presented by the aid could be
interpreted by blind persons. Quite clearly, if this was not possible there
would be little point in proceeding further. It is emphasized here that the
use of a torch is not envisaged as the final form of the aid; a binaural
presentation is the ultimate aim, but even a torch has already been found to
have useful features.

RESULTS  OBTAINED  WITH  THE  GUIDANCE  AID 

Once  the  aid  was  made  portable  the  potentialities  became  obvious.  The 
results  are  best  described  in  terms  of  the  objects  and  obstacles  encountered, 
but  it  is  impossible  to  describe  the  sounds  as  these  are  characteristic  of  the 
system  alone.  A  sketch  of  the  frequency  spectrum  does,  however,  give  some 
idea  of  the  character  of  the  audio  signals  when  these  are  complex. 

Simple  Objects 

1) Smooth wall: This can be detected up to 30 feet provided the torch is
pointed directly at the wall. Rotation of the torch over an angle of ±5
degrees produces a rapid change in the amplitude of the almost pure tone
echo.  Since  the  signal  intensity  is  higher  than  can  be  obtained  from  any 
other  object  there  is  no  difficulty  in  deciding  it  is  a  smooth  wall  and  the 
direction  in  which  it  runs. 


2) Post: The distance at which a post can be detected depends upon its
diameter, but a 2-inch diameter sign post can be located at 10 feet,
provided the torch is pointing horizontally. Again, rotation of the torch
produces a rapid change in the signal intensity, but because the distance (as
determined by the frequency of the signal) is less than for a wall echo of
the same intensity of signal, the object is quickly classified. A sharp corner
produces a similar effect but with much lower signal level. Since a
1-millimeter diameter wire can be detected at 4 feet, most sharp corners can be
detected before a person collides with one.

3)  Corner  of  room,  etc.:  A  strong  signal  is  received  which  is  of  the  same 
order  as  that  from  a  wall,  but  on  rotating  the  torch  the  sound  changes  in  a 
different  manner  because  of  the  different  geometrical  shape.  Some  practice 
is  required  in  order  to  notice  the  difference. 

Simple  objects  such  as  those  described  produce  signals  which  are 
nearly  pure  tones,  and  many  such  examples  can  quickly  be  brought  to  mind. 

Complex  Objects  Having  a  Well  Defined  Character  in  the 
Signal  Pattern 

1)  Ascending  steps:  Many  tones  in  an  ascending  scale  are  heard  as  the 
torch  is  directed  up  the  steps.  The  sound  is  musical  and  each  step  can  be 
counted  as  one  note  after  another  is  heard  to  start.  This  can  be  understood 
better  from  Figure  9. 

Figure 9 Sound Spectrum from Ascending Steps

2) Railings: These give a similar sound pattern, but of course from a
different plane, and the pattern can therefore only be a succession of upright posts.

3)  Bush:  Each  leaf  or  small  branch  produces  its  own  weak  signal,  which 
when  added  to  all  the  others,  each  of  a  different  frequency,  produces  noise 
in  a  limited  band  of  frequencies.  This  band  depends  upon  the  size  of  the 
bush,  as  seen  in  Figure  10. 




4)  Gravel  path:  Each  stone  produces  its  own  signal  as  for  the  leaves  of  a 
bush,  but  the  frequency  band  is  much  greater — depending  upon  the  angle  of 
the  beam.  Grass  has  a  similar  effect.  When  compared  with  a  smooth  path  or 
road, the signal is very strong, but even concrete paving gives a detectable
return, as seen from Figures 11 and 12.


Figure 10 Sound Spectrum from a Small Bush

Figure 11 Sound Spectrum from Gravel or Grass

Figure 12 Sound Spectrum from Paving

(Figures 10 to 12 plot amplitude against frequency.)


5)  Descending  steps:  These  are  detected  not  by  the  presence  of  a  signal 
but  its  absence.  The  background  from  the  pathway  (paving,  etc.)  suddenly 
ceases  at  about  6  feet  distance  if  the  torch  is  pointed  slightly  towards  the 
ground.  The  subject  then  realizes  the  break  in  the  surface  and  proceeds 
cautiously  to  check  if  it  is  a  hole  or  steps.  If  a  hole,  there  is  an  echo  from 
the  far  wall,  indicating  the  width  of  the  hole.  Descending  steps  can  be 
counted  for  a  distance  of  three  or  four  steps  by  noting  the  change  in  signal 
as the torch is directed down the steps at an almost vertical angle.
Difficulty is experienced when the floor is smooth since the background return
is weak, but the subject knows this and can take extra care if steps
are likely to be encountered.

6)  Pedestrians  are  observed  by  the  rapid  change  in  the  frequency  of  their 
echo  signals  as  they  approach  or  pass  ahead. 

7)  Objects  fluttering  in  the  breeze  are  observed  by  the  wavering  of  the 
echo  frequency;  tall  blades  of  grass  are  easily  heard. 
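
The rising scale heard from ascending steps (item 1 above) can be illustrated with a simple range-to-frequency model. The sketch below is not the author's analysis: it assumes a linear sweep of 30 kc/sec in 200 msec, a speed of sound of about 1,125 ft/sec, a first step edge 4 feet away, and step edges 1 foot further apart in range. All of these values and the function names are hypothetical.

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0         # assumed
SWEEP_RATE_HZ_PER_S = 30_000.0 / 0.200   # assumed linear sweep, short-range setting


def tone_hz(range_ft):
    """Audible tone produced by a reflector at the given range."""
    return SWEEP_RATE_HZ_PER_S * 2.0 * range_ft / SPEED_OF_SOUND_FT_PER_S


def step_tones_hz(first_step_ft, spacing_ft, n_steps):
    """One tone per step edge; farther steps give higher tones."""
    return [tone_hz(first_step_ft + i * spacing_ft) for i in range(n_steps)]


if __name__ == "__main__":
    for i, f in enumerate(step_tones_hz(4.0, 1.0, 5), start=1):
        print(f"step {i}: about {f:.0f} cps")
```

Under these assumptions each additional foot of range raises the tone by about 267 cps, so successive steps are heard as separate notes in a rising scale.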

The  performance  of  the  device  is  unimpaired  in  a  60  mph  gale  and  no 
serious  fading  of  signals  has  been  observed.  This  is  partly  accounted  for  by 
the  fact  that  few  echoes  are  of  absolutely  constant  amplitude  because  of 
the wide frequency sweep in the transmission medium. Variation in
amplitude due to atmospheric conditions is therefore unnoticed. Even heavy
rain seems to have no effect and fog made no difference to the
performance.

There  are  many  more  complex  objects  than  those  given,  and  it  may  now 
be  appreciated  that  an  infinite  variety  of  signals  is  possible.  We  all  have 
learned  many  sound  patterns — speech  is  an  excellent  example — and  it  is 
not  unlikely  a  blind  person  could  learn  the  sound  patterns  from  this  system, 
but  just  how  well  has  yet  to  be  determined. 

Only  a  few  blind  persons  have  tried  the  guidance  aid  at  the  time  of 
writing,  and  these  for  only  short  periods  of  time  and  without  training.  Their 
reactions  were  sufficiently  encouraging,  however,  for  an  order  to  be  placed 
with  a  manufacturer  for  ten  pocket  units  so  that  a  controlled  evaluation 
could  be  made  over  a  period  of  time.  One  unit,  tried  by  untrained  persons, 
is  not  likely  to  provide  the  required  information.  The  fact  that  ten  units 
have been ordered does not, however, mean that the aid is already
considered to be a success.

POTENTIALITIES  AND  PROBLEMS 
OF  A  BINAURAL  GUIDANCE  AID 

A binaural aid requires a wide transmission beam as well as a wide
receiving beam for each channel. The two receiving transducers must be
spaced  about  the  same  distance  apart  as  the  ears  and  separate  amplifiers 
are  required  for  each  channel.  The  advantage  of  course  is  the  wide  arc 
from  which  echoes  can  be  received  without  movement  of  the  head — as  this 
is  where  the  transducers  must  be  worn. 

As explained in an earlier section, the normal reaction to binaural
signals is to relate the signal to some position in space. This we have learned
to do from birth, and in fact achieve remarkable perception within a few
months, as can be observed in any baby. Can one do the same with a
binaural guidance aid? If this were possible the objects themselves might
appear to make their own signals and a sound picture in at least two
dimensions would be obtained. There is only slight evidence in support of
this  hypothesis  at  present,  as  little  work  has  been  done  which  is  sufficiently 
relevant.  Tests  with  subjects  on  binaural  acuity  revealed  that  the  subjects 
felt  the  sound  to  be  within  the  head.  This  has  been  called  lateralization  as 
opposed  to  localization  which  is  the  normal  process.  In  these  cases  the 
sound  was  fed  to  each  ear  by  headphones  and  the  subjects  were  tested  only 
at irregular intervals. They certainly did not use the headphones
continuously. It is probably only by some association with an external source
of  sound  that  one  can  obtain  normal  binaural  effects,  and  the  author  has 
been  unable  to  find  evidence  that  this  has  been  investigated.  Jeffress  and 
Taylor  (4)  reported  that  some  subjects  had  the  experience  of  feeling  the 
sound to come from an external source but this was not general; it
depended to some extent on what was asked of them.

A  binaural  guidance  aid  has  been  built  and  at  the  time  of  writing  tests 
are about to be started. Because the sensitivity falls rapidly with an
increase in the transmission and reception arc when disc radiators are used,
specially  curved  transducers  have  been  developed.  Adequate  sensitivity  has 
not  yet  been  obtained  for  normal  guidance  purposes,  but  the  sensitivity 
available  is  sufficient  for  tests. 

CONCLUSION 

It  has  been  shown  that  the  most  efficient  guidance  system  using  ultrasonic 
transmission  is  in  the  form  of  a  frequency-modulated  wave  similar  to  that 
used  by  bats,  who  rely  entirely  upon  ultrasonic  waves  for  their  orientation. 
Information can be gathered about one’s surroundings in the form of sound
patterns, some simple and others complex, and it is possible that a blind
person can learn to interpret these. It is essential that the normal auditory
cues obtained by a blind person are not destroyed, but they could be
suppressed, if desired, when superior information which was easily learned
was available. Tests are to be carried out on a small group, each with his
own  aid,  over  a  period  of  time,  to  determine  how  effective  and  acceptable 
the  device  can  be. 

REFERENCES 

1. Beurle, R. L., “Electronic Guiding Aids for Blind People,” Elec. Eng., Vol. 23
(1951), p. 2.

2. Busher, W. E., “Hearing Aids and Blind Guidance Devices,” Electronics, Vol. 34
(1961), p. 43.

3. Griffin, D. R., Listening in the Dark. New Haven, Connecticut: Yale University
Press, 1958.

4. Jeffress, L. A., and R. W. Taylor, “Lateralization vs. Localization,” J. Acoust.
Soc. Amer., Vol. 33 (1960), p. 482.

5. Kay, L., “A Plausible Explanation of the Bat’s Echo Location Acuity,” Anim.
Behav., Vol. 10 (1962), pp. 34-41.

6. Kuhl, W., G. R. Schodder, and F. K. Schroder, “Condenser Transmitters and
Microphones with Solid Dielectric for Airborne Ultrasonics,” Acustica, Vol.
4 (1954), p. 519.

7. Pye, J. D., “A Theory of Echo Location by Bats,” J. Laryng. Otol., Vol. 74 (1960),
pp. 718-729.

8. Research on Guidance Devices for the Blind. New York: Haskins Laboratories,
1947.

9. Strother, G. K., “Note on the Possible Use of Ultrasonic Pulse Compression by
Bats,” J. Acoust. Soc. Amer., Vol. 33 (1961), p. 696.


ACTIVE  ENERGY  RADIATING  SYSTEMS: 


THE  80-CHANNEL  ELEKTROFTALM 

WITOLD  STARKIEWICZ 

Pomeranian  Medical  Academy,  Szczecin,  Poland 

TADEUSZ  KULISZEWSKI 

Wroclaw  Technical  University,  Wroclaw,  Poland 


INTRODUCTION 

The elektroftalm is an apparatus to enable the blind to recognize objects
in their surroundings through the action of light rays coming
from those objects upon photosensitive elements. The name “elektroftalm”
was  coined  in  1897  by  Noiszewski  for  an  apparatus  built  by  himself.  In  his 
device  Noiszewski  made  use  of  the  photosensitive  properties  of  selenium; 
he  was  the  first  to  build  an  apparatus  for  the  blind  based  on  these  properties. 

Noiszewski’s  elektroftalm  consisted  of  a  selenium  photocell  placed  on 
the  forehead  of  a  blind  person,  and  a  device  which  transformed  light  stimuli 
falling  on  the  selenium  photocell  into  sound  stimuli.  A  blind  person 
equipped  with  this  type  of  elektroftalm  would  hear  a  signal  in  his  earphones 
when  he  turned  toward  a  source  of  light.  He  could  thus  recognize  the 
position of a window, a lamp, and so forth. Noiszewski’s other projects
included the construction of an apparatus to transform light signals into tactile
or thermal cues. In his time, however, technical possibilities were much
more limited than they are now, and it was impossible to construct a device
which gave the blind substantially greater spatial orientation than they
enjoyed without any device.

The general principle on which the modern elektroftalm is based is
the conversion of light stimuli into tactile stimuli sensed on the skin of the
blind user. In short, it converts light energy into mechanical energy. The
process of conversion is very simple in itself, and with the present development
of engineering science such a device can be constructed easily.

The idea for constructing such an apparatus for the blind and the
conception of the project belong to Dr. Starkiewicz, head of the Chair for
Ophthalmology of the Pomeranian Medical Academy. Dr. Kuliszewski, as
the head of the Chair for Telecommunication Devices of the Wroclaw
Technical University, was given the tasks of developing the project
theoretically, directing the technological aspects of development, and producing
an  apparatus  based  on  the  supplies  available  in  Poland.  The  task  was  very 
difficult  in  the  beginning,  since  not  all  the  elements  available  in  Poland 
needed  for  the  construction  were  of  satisfactory  technical  quality.  This  was 
especially  true  with  reference  to  semiconductor  elements. 

THE  METHOD  AND  THE  APPARATUS 

The process of formation of a plastic image on the skin of a blind
individual is similar to the process of formation of an image in the sighted
person’s  brain.  The  human  eye  is  replaced,  in  the  elektroftalm,  with  an 
optical  camera  in  which  the  image  is  formed  on  a  mosaic  composed  of  a 
large  number  of  semiconductor  photoelements.  On  this  mosaic,  then,  a 
point  analysis  of  an  image  is  effected.  Each  point  of  the  image  on  the 
mosaic  (i.e.,  each  semiconductor  photoelement)  changes  a  light  stimulus 
into an electrical stimulus. The optic nerves are replaced by wires and
electronic  amplifiers  to  amplify  the  signal.  The  receiver  of  the  image,  in  the 
elektroftalm,  where  the  point  synthesis  of  the  image  is  effected,  is  a  system 
of  special  tactile  elements,  the  number  of  which  is  equal  to  the  number  of 
photoelements  in  the  mosaic.  The  tactile  elements  change  electrical  into 
mechanical  stimuli  and  evoke  the  phenomenon  of  touch  on  the  skin  of  the 
user. It follows, therefore, that each analyzed point of the image is
transferred onto the skin of the user by means of a separate photoelectronic
channel. One major difference in our analogy, of course, is that while the
number of analogous channels in the visual system amounts to several
millions, the number of photoelectronic elements we can use is strictly
determined by the capability of the skin to discern tactile stimuli.

Figure 1 shows one photoelectronic channel of the elektroftalm. Its
operation is as follows. A light ray, reflected from a lighted object, falls
onto a photoelement of the mosaic in the optical camera. That element
produces an electrical pulse output whose value is a function of the
brightness of the incident light. The pulses are amplified by transistor amplifiers
and fed to a corresponding tactile element; in our device this is of the
electromagnetic type. One necessary condition is that the intensity of tactile
stimulation should be a function of the brightness or intensity of the
incident light. This is the most difficult condition to fulfill.

Figure 1 Diagrammatic Representation of One Photoelectronic Channel
of the Elektroftalm (labeled parts: objective, transistor, forehead)

The  area  most  sensitive  to  touch  is  the  finger  tip,  used  by  the  blind  to 
read  braille.  This  area  is  too  small,  however,  for  the  reproduction  of  a 
plastic  image.  In  the  elektroftalm  the  image  is  produced  instead  on  the 
forehead. The forehead is less sensitive to touch and incapable of
discriminating tactile stimuli that are very close to one another. Because of
this  property  of  the  forehead,  and  of  the  necessary  size  of  the  tactile  element, 
the  number  of  the  photoelectronic  channels  in  the  elektroftalm  cannot  be 
large.  In  the  first  model,  the  number  of  channels  was  80;  in  the  new  model 
now  under  construction  this  has  been  increased  to  120. 

It  follows  from  this  description  that  the  elektroftalm  consists  of  three 
basic  parts  connected  by  wire  with  one  another: 

(1) an optical camera with a mosaic of photoelements where the image
is  formed; 

(2)  a  set  of  transistor  amplifiers  and  suitable  input/output  matching 
circuitry;  and 

(3)  a  set  of  tactile  elements  to  produce  a  tactile  (plastic)  image. 

Each  portion  of  the  system  must  satisfy  rigid  requirements  for  quality  and 
trouble-free  operation.  It  is  desirable,  moreover,  that  size  and  weight  be 
minimal.  Let  us  consider  these  system  elements  one  at  a  time. 

THE  OPTICAL  CAMERA 

The design and construction of an optical camera with the required objective
lens, some means of manual control of the sharpness of the image, and
an adjustable diaphragm do not entail any particular
technological difficulties, even if the size of the whole camera must be
quite small. Its size is determined, however, by the size of the photoelement
used, the number of photoelements used in the mosaic, and the width of the
angle of view (especially in the horizontal plane). The field of vision in the
horizontal  plane  of  an  average  man’s  eye  is  about  120  degrees  (with  both 
eyes, 180 degrees), while the resolution is about one minute of arc (under
the best conditions). Partially sighted persons whose eyes have a resolving
power of 30 to 50 minutes of arc can enjoy tolerable orientation in their
surroundings, even if their field of vision is very small (as with so-called
“telescopic”  or  “tunnel”  vision) . 

In  the  first  80-channel  elektroftalm  model,  the  resolution  was  about  3 
degrees  and  the  field  of  view  in  the  horizontal  plane  was  about  36  degrees. 
In  the  newer  120-channel  model  the  resolution  is  of  the  order  of  2  degrees 
with a field of view of about 28 degrees. The first model of the optical
camera is shown in Figure 2, and Figure 3 shows the camera unmounted.
In the middle of the picture is the mosaic of 80 photoelements.

Figure 2 First Model of the 80-Channel Optical Camera
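
The resolution and field-of-view figures for the first model are consistent with its channel count, as a rough calculation suggests. The sketch below assumes a rectangular mosaic with equal angular resolution in both directions and uses the 20-degree vertical field reported in the account of the field tests. The function name is illustrative.

```python
def approx_channels(fov_h_deg, fov_v_deg, resolution_deg):
    """Approximate number of mosaic elements covering a rectangular field."""
    columns = fov_h_deg / resolution_deg
    rows = fov_v_deg / resolution_deg
    return columns * rows


if __name__ == "__main__":
    # First model: 36 x 20 degree field at about 3 degrees per element.
    print(round(approx_channels(36, 20, 3)))
```

The result is about 80, matching the first model. The vertical field of the 120-channel model is not given, so the same check cannot be made for it.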

The  photoelements  used  in  the  first  model  were  germanium  photocells, 
which  had  a  number  of  serious  drawbacks,  including: 

(1) low sensitivity, decreasing further with time;

(2)  large  variation  in  sensitivity  with  ambient  temperature; 

(3) undesirable distribution of sensitivity over the spectrum, with
maximum sensitivity in the infrared region; and

(4)  relatively  large  size. 

These drawbacks are overcome in the newer model with the use of silicon
cells, which are characterized by good sensitivity over the visible spectrum,
minimal temperature-related variations in output, conveniently small size,
and high output. Figure 4 compares the spectral sensitivity of the two
photoelements. The region of visible light is shaded. Unfortunately
silicon elements are not yet produced in Poland, which necessitates
importation of the elements and thus makes them hard to obtain. If we
could use quite small silicon elements, it would be possible to reduce the
size and weight of the optical camera considerably.

Figure 4 Spectral Sensitivity of Germanium Versus Silicon Photocell
Types

THE  TRANSISTOR  AMPLIFIERS 

The construction of transistor amplifiers in compact configurations is
sufficiently developed in Poland for our needs. This is not, therefore, an
important problem for us. To obtain highly compact (even microminiature)
components, however, foreign-made elements will have to be used.

THE  TACTILE  TRANSDUCER 

The most important problem so far, and one which is not yet solved
completely, is the development of a tactile element that will yield an increment
in tactile pressure proportional to the brightness (intensity) of the light
falling on the object scanned, to make it possible to recognize about ten
degrees of brightness throughout the image used. For this purpose, the
dynamic characteristic of the electromagnetic element should be straight-line
(linear). Actually, the dynamic characteristic of the elements built
so far has proved to be of the type shown in Figure 5, from which we
conclude that for very small light intensities the action of the unit is not
perceptible to the user, in spite of the fact that the electrical signal reaching
the element is proportional in value to brightness. Beyond a certain critical
value of brightness, the action of the tactile element rises rapidly to a
maximum. In practical terms, only about two or three degrees of variation of
brightness level can be discriminated. The mean power consumed for the
operation of this element is about 30 to 50 milliwatts.

Figure 5 Dynamic Characteristics of the Electromagnetic Tactile Element
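
The consequence of the characteristic in Figure 5 can be illustrated with a toy model. The sketch below is not taken from measurements: the threshold (0.45) and the width of the rising region (0.15) are hypothetical values on a normalized brightness scale of 0 to 1, chosen only to show how a dead zone followed by a rapid rise to saturation collapses ten input levels into a few distinguishable output levels.

```python
def ideal_response(brightness):
    """Straight-line characteristic: tactile effect proportional to brightness."""
    return brightness


def threshold_response(brightness, threshold=0.45, rise_width=0.15):
    """Toy model: no effect below a threshold, then a rapid rise to maximum."""
    if brightness < threshold:
        return 0.0
    return min(1.0, (brightness - threshold) / rise_width)


def distinct_levels(response, n_levels=10):
    """Count distinct outputs for n equally spaced brightness levels."""
    outputs = {round(response(k / n_levels), 3) for k in range(1, n_levels + 1)}
    return len(outputs)


if __name__ == "__main__":
    print(distinct_levels(ideal_response))
    print(distinct_levels(threshold_response))
```

With these assumed parameters the straight-line characteristic keeps all ten levels distinct, while the threshold characteristic yields only three, in line with the two or three degrees of brightness reported in the text.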

Another difficulty was in specifying the mode of operation of the tactile
element, i.e., what kind of tactile stimulus should be produced. It
appeared that a static pressure stimulus was perceived only at the moment of
onset, whereupon adaptation followed quickly and the pressure effect
disappeared. Any discrimination of degrees of brightness became impossible.
With the introduction of a moving stimulus with a frequency variation of
about 10 cps, the tactile effect was made to appear constant.


FIELD  TESTS  OF  THE  ELEKTROFTALM 

Before trying the new device, a decision had to be made on the best way of
positioning  the  optical  camera.  At  first  the  optical  camera  was  placed  on 
the  blind  subject’s  head,  and  connected  to  the  forehead  plate  which  con¬ 
tained  the  tactile  elements.  An  improved  mounting  system  is  shown  in 
Figure  6,  and  Figure  7  shows  how  it  is  worn  on  the  head  of  the  user,  with 
the  front  cover  removed  to  show  the  tactile  elements. 

Figure 6 Method of Head Mounting of Optical Camera and Tactile Element
Assembly

Figure 7 Blind User Wearing Head-Mounted Assembly. The cover of the
tactile element assembly has been removed to show detail of construction.

Initial experiments with the 80-channel elektroftalm using germanium
photodiodes were carried out with a 40-year-old woman who was
blind from birth. The experiment lasted a fortnight. As a
result  of  practice,  the  subject  began  to  localize  correctly,  and  she  would 
point  with  her  hand  to  large  white  objects  against  a  black  background 
when  these  were  placed  at  various  positions  in  her  field  of  vision,  whose 
dimensions  were  36  degrees  horizontally  and  20  degrees  vertically.  She 
also  recognized  the  shapes  of  some  simple  objects  like  horizontal  and 
vertical  lines  which  were  moved  in  various  directions.  To  aid  recognition 
of  shape  and  position,  the  subject  would  move  her  head  at  will,  and  this 
made  it  easier  for  her  to  recognize  the  objects  correctly. 

The experiment had to be discontinued because of the deterioration of
the apparatus: the number of working tactile elements fell gradually to
20 toward the end of this period. The elements themselves were operative,
but the whole device became inoperative due to factory faults in the
germanium photoelements and the transistors used. As indicated above, the
device also responded mainly to infrared light and very little to visible light;
this was a function of the germanium properties. In addition to these
difficulties, the weight of the device on the head was too great (1200 grams),
as was that of the auxiliary equipment such as the amplifiers and batteries.

PLANS  FOR  THE  FUTURE 

In  the  new  120-channel  elektroftalm  now  under  construction,  silicon 
photodiodes  will  be  used  which,  as  indicated  above,  will  have  none  of  the 
drawbacks  of  the  germanium  photoelements,  and  are  extremely  sensitive  as 
well. Brightness levels of 20 to 40 lux (about 1.9 to 3.7 foot-candles)
influence  the  tactile  elements  perceptibly.  The  construction  of  the  tactile 
elements  is  also  better;  they  are  much  lighter.  Preliminary  experiments  with 
the  silicon  photodiode  version  of  the  new  elektroftalm,  with  only  a  few 
channels  in  operation,  were  successful  in  every  sense.  Figure  8  shows  the 
120-channel  elektroftalm  and  the  way  it  is  used.  A  box  placed  on  the  chest 
of  the  user  contains  all  the  transistor  amplifiers  together  with  the  associated 
circuitry.  The  optical  camera  has  been  placed  here  on  top  of  the  box,  for 
we wish to discover whether it is in fact indispensable for the optical camera
to  be  on  the  user’s  head.  In  any  event,  the  camera  can  be  replaced  on  the 
head  quickly  and  with  no  modification  (see  Figure  9).  As  the  reader  can 
see  from  the  illustration,  the  requirements  for  portability  have  been  satisfied. 
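
The illumination figures quoted for the silicon photodiodes can be converted between units as a check. The sketch below uses the standard relation that one foot-candle is one lumen per square foot, or about 10.764 lux. The function name is illustrative.

```python
LUX_PER_FOOT_CANDLE = 10.764  # 1 lumen/ft^2 expressed in lumens/m^2


def lux_to_foot_candles(lux):
    """Convert illuminance from lux to foot-candles."""
    return lux / LUX_PER_FOOT_CANDLE


if __name__ == "__main__":
    for lux in (20, 40):
        print(f"{lux} lux is about {lux_to_foot_candles(lux):.2f} foot-candles")
```

The conversion gives roughly 1.9 and 3.7 foot-candles for 20 and 40 lux respectively.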

Our  future  plans  include  completing  the  120-channel  elektroftalm  and 
carrying  out  further  tests  with  the  blind.  Special  stress  is  being  laid  on  the 
development  of  a  suitable  tactile  element,  which  may  not  necessarily  be  of 


165 


The  80-Channel  Elektroftalm 

the  electromagnetic  type.  The  help  of  specialists  would  be  very  desirable 
and  much  appreciated.  In  this  connection  we  should  mention  that  plans  for 
future  work  include  not  only  the  use  of  silicon  cells  but  also  the  use  of 
television  techniques.  These  changes  will  make  it  possible  not  only  to  use 
visible  light  but  to  increase  the  number  of  photoelectronic  channels,  i.e., 


Figure  8  The  120-Channel  Elektro-  Figure  9  An  Alternative  Mounting  of 

ftalm  in  Use.  Only  three  channels  are  the  Optical  Camera  with  the  120-Channel 

actually  operative  due  to  short  supply  of  Elektroftalm 
the  requisite  components. 

it  will  permit  the  blind  user  to  recognize  a  larger  number  of  details  of  ob¬ 
jects  in  his  environment.  It  would  appear  necessary  then  to  make  use  of 
the  skin  of  the  chest  of  the  user  as  a  receptor  of  the  sensing  stimuli  in  order 
to  produce  a  picture  of  objects  in  the  environment.  The  chest  can  accom¬ 
modate  a  considerably  larger  number  of  stimulators  than  the  forehead. 
Tactile,  vibratory,  electric,  and  other  stimulators  are  being  considered. 

In the overall estimate of the results of our work on the construction
and use of the elektroftalm, the physiological results should
be considered separately from the technical results of development and
construction. It is mainly in this latter domain that our achievements
lie. Lesser results in the physiological domain have not followed from
inadequate theoretical conceptualization, but are due solely and exclusively
to technical shortcomings in the device that we could not, hitherto,
overcome. It is to be hoped that the experience we have acquired will enable us
to construct new and improved models of the elektroftalm which will allow
us to obtain yet better results from the physiological point of view. We
are confident that the planned further models of the elektroftalm will
provide the blind with spatial orientation similar to that possessed by people
who can see.


ACTIVE  ENERGY  RADIATING  SYSTEMS: 
AN  ELECTRONIC  TRAVEL  AID 


THOMAS  A.  BENHAM 

Haverford  College,  Haverford,  Pennsylvania 

J.  MALVERN  BENJAMIN,  JR. 

Biophysical  Electronics  Division  of  Communications 
Industries,  Bala  Cynwyd,  Pennsylvania 


INTRODUCTION 

There  would  appear  to  be  approximately  10,000  blind  people  throughout 
the  country  who  could  benefit  from  the  use  of  an  adequate  electronic 
travel  aid.  The  past  fifteen  years  of  experimentation  have  indicated  that 
any  such  device  for  detecting  hazards  during  travel  by  the  blind  should 
fulfill  three  main  requirements:  1)  it  should  leave  the  ears  free  for  de¬ 
tecting  natural  auditory  cues;  2)  it  should  detect  obstacles  and  indicate 
roughly  their  proximity;  and  3)  it  should  detect  discontinuities  in  the  ter¬ 
rain.  The  latter  two  functions  should  present  different  tactile  stimuli  to  the 
user. 

About  20  guidance  devices,  working  on  a  variety  of  principles,  have 
been  built  in  the  last  15  years.*  A  few  have  operated  from  ambient  light 
reflected  by  the  obstacle,  but  most  have  radiated  a  beam  of  sound  or  elec¬ 
tromagnetic  radiation  and  detected  the  reflected  ray.  In  most  cases  range 
information  is  elicited;  information  regarding  the  presence  or  absence  of 
the  obstacle  is  obtained  also. 

Historically  most  investigators  have  considered  the  detection  of  sub¬ 
stantial  obstacles  such  as  walls  or  doors  as  their  principal  goal.  Gradually 
they  have  realized  the  importance  also  of  step-down  detection,  more  dif¬ 
ficult  than  the  detection  of  large  conventional  obstacles;  and  of  step-up 
detection  which,  while  superficially  related  to  obstacle  detection,  presents  a 
further  difficulty. 

Throughout the development of these devices, the greatest emphasis
has been upon systems by which an otherwise able-bodied individual,
lacking only vision, could find his way about safely at a reasonable walking
speed of three or four miles per hour. The individual's reaction time, stumble
recovery, and general physical condition are expected to be normal or
perhaps even somewhat above normal.

* Some of the earlier models with details of operation and evaluation are
discussed in Reference 8 (q.v.).

Reflection  of  sound  forms  the  most  versatile  source  of  travel  informa¬ 
tion  available  to  the  blind.  Sound  from  the  tap  of  a  cane  or  the  scuff  of  a 
foot  reflected  from  an  obstacle  will  announce  its  presence  to  the  acutely 
perceptive,  but  the  newly  blind  or  the  experienced  traveler  in  a  noisy 
street  will  find  these  echoes  too  faint.  To  overcome  these  problems  several 
workers  have  developed  small,  easily  carried  horns  that  generate  high  fre¬ 
quency  sound  in  sharp  bursts.  Twersky  (6)  and  Witcher  (7)  each  pro¬ 
duced  useful  instruments  of  this  type.  It  is  the  writers’  point  of  view  that  the 
travel  aid  should  engage  the  native  ability  of  the  user  to  the  maximum.  The 
instrument  should  not  try  to  replace  functions  that  the  human  being  already 
possesses.  Thus,  at  first  glance,  an  instrument  using  sound  might  seem  to 
be  very  desirable,  since  it  does  employ  the  natural  faculties  to  a  maximum. 
However,  it  is  not  possible  to  obtain  a  sufficiently  narrow  beam  for  fre¬ 
quencies  in  the  audible  range. 

There is one factor in favor of using ultrasonics, but several against it.
On the positive side, the speed of transmission of ultrasonic waves makes it
possible to employ radar or modified radar principles. This is not possible
with electromagnetic waves since the distances are too short and the speed
of transmission too high. However, none of the ultrasonic devices built*
has proved satisfactory because of the very nature of the conducting
medium: air. Thermal and convection currents introduce refraction effects that
frequently obliterate the signals completely.
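The timing argument can be illustrated with a short calculation. The sketch below compares the round-trip delay of a sound pulse and of an electromagnetic pulse over the short distances of interest. The two propagation speeds are assumed round figures (sound in air at room temperature, light in free space) and do not come from the paper; the function name is illustrative only.

```python
# Round-trip echo delay for a pulse reflected from an obstacle.
# Assumed constants (not given in the paper):
SPEED_OF_SOUND_FT_S = 1125.0    # sound in air at about room temperature
SPEED_OF_LIGHT_FT_S = 9.836e8   # electromagnetic waves in free space


def round_trip_delay_s(distance_ft: float, speed_ft_s: float) -> float:
    """Time for a pulse to travel to the obstacle and back, in seconds."""
    return 2.0 * distance_ft / speed_ft_s


if __name__ == "__main__":
    for distance_ft in (3.0, 9.0, 20.0):
        t_sound = round_trip_delay_s(distance_ft, SPEED_OF_SOUND_FT_S)
        t_em = round_trip_delay_s(distance_ft, SPEED_OF_LIGHT_FT_S)
        print(f"{distance_ft:4.0f} ft: sound {t_sound * 1e3:6.2f} ms, "
              f"electromagnetic {t_em * 1e9:6.2f} ns")
```

Under these assumptions an echo from 20 feet returns after tens of milliseconds for sound but only tens of nanoseconds for electromagnetic waves, which is the difference the paragraph above relies on.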

Reflection  using  feasible  frequencies  is  also  quite  specular,  which 
makes  it  necessary  to  have  the  incident  beam  of  energy  strike  the  surface 
to  be  detected  at  right  angles;  this,  of  course,  is  not  practical.  This  specu¬ 
larity  is  due  to  the  fact  that  most  surfaces  are  too  “smooth”  for  waves  of  a 
centimeter  or  so  in  length.  If  the  wavelength  is  reduced  to  the  point  where 
surfaces  scatter  sufficiently,  absorption  by  the  air  makes  the  amount  of 
power  required  prohibitively  high.  Furthermore,  the  physical  size  of  radia¬ 
tors  and  receivers  is  excessively  large. 
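To put the "centimeter or so" wavelength in terms of frequency, the relation wavelength = speed / frequency can be applied. The sketch below assumes a speed of sound of 343 meters per second (air at about 20 degrees C), a value not stated in the paper.

```python
# Conversion between wavelength and frequency for sound in air.
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 degrees C


def frequency_for_wavelength_hz(wavelength_m: float) -> float:
    """Frequency whose wavelength in air equals wavelength_m."""
    return SPEED_OF_SOUND_M_S / wavelength_m


def wavelength_for_frequency_m(frequency_hz: float) -> float:
    """Wavelength in air of a tone at frequency_hz."""
    return SPEED_OF_SOUND_M_S / frequency_hz


if __name__ == "__main__":
    # A one-centimeter wavelength lies well above the audible range.
    print(f"{frequency_for_wavelength_hz(0.01) / 1e3:.1f} kc")
```

On this assumption a one-centimeter wave corresponds to roughly 34 kilocycles; shortening the wavelength further to obtain more scattering raises the frequency in proportion.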

Radar principles employing electromagnetic radiation are not feasible
because of the extremely short range (0 to 20 feet). One system was tried
employing electromagnetic radiation, comprising a quarter-wave antenna
which changed the frequency of a high frequency radio oscillator when it
neared an obstacle, but ground effects prevented the detection of most
obstacles.

* The Hoover Company, the Brush Development Company, and
Stromberg-Carlson have built such devices. For further information see
References 4 and 5.

In another approach the Franklin Institute and others* developed
electrical canes, using long-wave electromagnetic radiation (2 Mc), which
sense the distance between the tip of the cane and the ground or obstacles.
With  this  device,  the  user  has  essentially  the  protection  of  a  normal,  rea¬ 
sonably  long  cane,  but  usually  does  not  have  to  touch  the  ground  physically. 
He  is  able  to  carry  the  tip  of  the  cane  a  few  inches  off  the  ground  without 
receiving  any  tactile  or  audible  signal.  If  the  tip  then  passes  over  the  edge 
of  a  curb,  the  distance  to  the  ground  suddenly  increases,  the  capacitive  effect 
between  the  end  of  the  cane  and  the  ground  changes  abruptly,  and  a  signal 
is  given.  This  approach  has  the  disadvantage  that  the  detection  range  is 
limited  to  only  a  few  inches  more  than  the  length  of  the  cane;  thus  it  cannot 
provide  early  warning. 

Three  devices  of  interest  have  been  developed  using  electromagnetic 
radiation  in  the  infrared  and  in  the  visible  regions.  One  of  these  was  an 
extremely  ingenious  and  well  engineered  instrument  called  “Optar”  (3). 
Detecting  ambient  light  reflected  from  the  obstacle,  it  determined  the  range 
by  automatically  locating  the  distance  behind  a  lens  at  which  the  image 
of  the  obstacle  was  in  sharp  focus.  Information  was  presented  as  a  tone, 
the  pitch  of  which  indicated  distance.  In  practice,  the  tone  was  complex 
and  difficult  to  interpret,  and  the  device  in  the  form  in  which  it  was  con¬ 
structed  was  not  capable  of  detecting  curbs. 

Another  device  working  in  the  same  part  of  the  spectrum  but  using 
optical  triangulation  was  developed  by  the  U.  S.  Army  Signal  Corps  (2). 
The  Radio  Corporation  of  America  later  manufactured  25  experimental 
units.  The  principle  of  operation,  briefly,  is  as  follows.  A  beam  of  light 
interrupted  500  times  per  second  by  a  motor-driven  chopper  emanates  in 
a  narrow  beam  from  an  optical  system,  strikes  an  object  or  the  ground,  and 
is  reflected  back  into  the  second  optical  system.  This  second  system  focuses 
the  image  of  the  spot  of  light  reflected  from  the  object  on  a  coding  disc, 
behind  which  is  located  a  photoelectric  cell.  The  output  of  the  photoelectric 
cell is fed through an electronic amplifier tuned to 500 cycles to a vibrator in
the handle of the instrument. The coding disc interrupts the reflected light
4, 8, 16, or 32 times per second, depending on the distance to the object.
Thus the 500 cycle pulses in the handle inform the user as to the distance
to an obstacle, while the position of the handle indicates the azimuth; that
is, whether the obstacle is to the left, to the right, or straight ahead.

* August W. McCullum of SITE (Sight Implementation through Electronics),
Inc.

Under  the  Committee  on  Sensory  Devices,  the  Franklin  Institute 
Laboratories  for  Research  and  Development  worked  on  an  optical  triangu¬ 
lation  ranging  system  using  ultraviolet  light.  Ultimately,  the  basic  idea  was 
given  up  because  of  numerous  technical  difficulties  attendant  upon  the  use 
of  radiation  at  these  frequencies. 

THE  HAVERFORD  PROJECT 

In  1950  the  Veterans  Administration  contracted  with  Haverford  College 
to  field  test  the  25  models  of  the  Signal  Corps  Sensory  Aid  with  67  blind 
subjects.  The  report  resulting  from  this  investigation  contained  a  list  of 
recommendations  for  future  development  (1).  In  1953,  under  a  continua¬ 
tion  of  this  contract,  development  of  an  improved  electronic  travel  aid  was 
begun.  Haverford  College  subcontracted  the  laboratory  investigation  to 
Biophysical  Instruments,  Inc. 

Work  on  the  project  has  resulted  in  the  production  of  three  prototype 
obstacle  detectors  that  are  at  present  being  field  tested,  and  appear  to  pro¬ 
vide  a  potentially  satisfactory  instrument.  In  addition,  a  portable  study 
model  of  a  curb  detector  has  been  completed  which,  while  it  detects  up-  and 
down-curbs,  is  not  in  a  satisfactory  form  for  practical  use,  and  requires 
much  further  development. 

Following is a summary of the list of recommendations contained in
the evaluation report referred to:

1)  Separate  obstacle  and  curb  locators  which  are  quiescent,  i.e.,  de¬ 
liver  no  signal  until  needed,  and  which  are  not  accidentally  actu¬ 
ated  by  irregular  movements  of  the  instrument  encountered  during 
walking. 

2)  Two-channel  presentation  through  tactile  stimulators:  one  for  curb 
signals,  the  other  for  obstacle  signals. 

3)  Automatic  scanning  for  obstacle  detector,  about  a  3-foot  wide  path 
at  a  distance  of  8  feet.  (Manual  scanning  retained  for  simplicity.) 

4)  Maintain  extreme  simplicity  in  getting  instrument  into  operation. 
The  user  would  prefer  simply  to  pick  up  the  device,  “flip  a  switch,” 
and  be  ready  to  travel. 




5)  Include  feature  to  aid  user  in  walking  a  straight  line.  How  this  may 
be  done  is  not  known.  It  is  listed  simply  to  encourage  thought  on 
the  subject. 

6)  No  moving  parts  except  tactile  stimulators. 

7)  A  dual-range  control  for  obstacle  detecting  (near  and  far). 

8)  Distance discrimination to within one foot of instrument.

9)  Attempt  to  reduce  noise  (background  interference)  due  to  bright 
sunlight. 

10)  Weight  of  complete  instrument  under  four  pounds.  Storage  battery 
life  of  four  hours  is  sufficient. 

11)  Progressively  complex  models  to  accommodate  users  of  varying 
degrees  of  ability. 

Research  has  indicated  that  obstacle  detection  is  quite  feasible  but 
that  detecting  discontinuities  in  the  terrain  presents  numerous  problems  of 
such  magnitude  that  a  definite  solution  has  not  yet  been  reached.  Progress 
has  been  hampered  by  the  necessity  of  keeping  the  cost  of  the  ultimate 
device  within  reach  of  the  prospective  blind  user.  It  is  expected  that  sub¬ 
sidies  will  be  necessary,  but  even  subsidies  must  be  kept  at  a  reasonable 
level. 


THE  OBSTACLE  DETECTOR 

Figure  1  shows  the  appearance  of  the  obstacle  detector  that  is  currently 
being  evaluated  by  a  few  blind  users.  The  optical  principle  by  which  it 
detects  is  as  follows. 

Optical  Triangulation.  A  beam  of  light  is  emitted  by  a  light  source  and 
focused  on  the  obstacle  to  be  detected.  If  this  obstacle  is  not  a  perfectly  re¬ 
flecting  surface,  some  light  rays  will  be  reflected  in  all  directions.  In  particu¬ 
lar,  some  rays  will  be  returned  at  such  an  angle  that  they  will  pass  through 
the  receiving  lens  and  be  focused  on  the  receiving  photodetector.  If  now 
the  obstacle  is  moved  to  a  new  position,  it  can  be  seen  that  the  rays  reflected 
by  it  will  be  returned  to  a  different  point  in  the  focal  plane  of  the  receiving 
lens.  Another  photodetector  placed  in  this  position  will  detect  the  pres¬ 
ence  of  an  obstacle  in  this  position.  Thus  two  small  photodetectors  appro¬ 
priately  positioned  may  be  used  to  detect  obstacles  in  two  different  ranges. 
In  our  device,  the  ranges  are  3  to  5  feet  and  5  to  9  feet.  The  spacing  be¬ 
tween  the  lenses  determines  the  distance  the  returned  light  spot  will  move 
behind  the  receiving  lens  as  the  obstacle  moves,  that  is,  the  ranging  sensi¬ 
tivity. 
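The triangulation geometry can be made concrete with a small calculation. The sketch below assumes a simple similar-triangles (thin-lens) model in which the returned spot lies off the receiving lens axis by focal length times lens spacing divided by obstacle distance. The 3-inch focal length and the two ranges are taken from the text; the 4-inch lens spacing is a hypothetical value chosen only for illustration, and the function names are not the authors'.

```python
# Similar-triangles model of optical triangulation ranging.
FOCAL_LENGTH_IN = 3.0   # receiving lens focal length, from the text
BASELINE_IN = 4.0       # hypothetical lens spacing; not given in the text


def spot_offset_in(distance_ft: float,
                   baseline_in: float = BASELINE_IN,
                   focal_in: float = FOCAL_LENGTH_IN) -> float:
    """Lateral offset of the returned spot in the focal plane, in inches."""
    return focal_in * baseline_in / (distance_ft * 12.0)


def range_band(distance_ft: float):
    """Which photodetector sees the spot: 'near' (3 to 5 ft) or 'far' (5 to 9 ft)."""
    if 3.0 <= distance_ft < 5.0:
        return "near"
    if 5.0 <= distance_ft <= 9.0:
        return "far"
    return None  # outside both ranges: no detector is illuminated


if __name__ == "__main__":
    for d in (3.0, 5.0, 9.0):
        print(f"{d:.0f} ft -> spot offset {spot_offset_in(d):.3f} in")
```

In this model the offset is proportional to the lens spacing, so doubling the spacing doubles the distance the spot travels across the detectors for the same change in obstacle distance, which is the ranging sensitivity described above.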




Figure  1  The  Haverford/Biophysical  Instrument  Obstacle  Detector 


Photodetector.  The  photodetector  required  for  this  purpose  must  meet  a 
number  of  specifications.  It  must  have  a  small  over-all  size  in  order  to  fit 
in  a  small  space,  and  its  sensitive  area  must  extend  very  nearly  to  one  edge 
so  that  two  may  be  placed  side-by-side  to  cover  adjacent  ranges  without 
losing  appreciable  light  between  them.  The  sensitive  area  must  be  small  to 
allow  use  of  a  small  cross-section  light  beam  and  to  keep  the  signal/noise 
ratio  high.  Our  device  has  a  beam  cross-section  of  about  one  inch  at  six 
feet.  It  must  have  a  spectral  response  compatible  with  that  of  the  light 
source  used.  If  it  is  too  broad  it  will  generate  unnecessary  noise. 

It must also meet several electrical requirements. The signal/noise ratio
must be high because a small sized, lightweight device must be quite
conservative of power. Its light response must be linear over a wide range of
light intensity so that it will work equally well at night and in bright
sunlight. The noise that it generates must also be virtually independent of
ambient light intensity to obviate automatic gain control circuitry. And
finally it must have response to 5000 cps to detect the pulses of light used
by the system. A germanium photodiode answered all these requirements.

Lamp.  The  lamp  needed  for  the  job  had  to  be  made  especially  for  our  use. 
A  glow  discharge  type  of  lamp  had  the  general  characteristics  required;  so 
we  had  a  parallel  electrode  type  of  neon  pilot  light  constructed  with  a 
xenon  fill  instead  of  the  usual  neon  to  give  greater  response  in  the  infrared, 
thus  matching  the  response  of  the  germanium  photodiode. 

Optics. Because of the over-all weight and power limitations, it was
necessary to make the optical system small, lightweight, and efficient. The
light-gathering power of a lens, of course, increases with increasing size, so that
theoretically the sending and receiving lenses should together fill the whole
frontal area of the device. An actual size limit is set by the minimum
F/number attainable for a given type and size of lens.

The  focal  length  of  the  receiving  lens  is  set  by  the  required  ranging 
sensitivity  and  the  sensitive  area  of  the  photodetector.  Since  two  photo¬ 
detectors  were  used  to  cover  the  two  adjacent  ranges,  it  was  necessary  to 
choose  a  focal  length  for  the  receiving  lens  such  that  the  image  of  the 
source  would  move  completely  across  the  width  of  one  photodetector  as 
the  obstacle  moved  from  one  end  of  a  given  range  to  the  other.  These  con¬ 
siderations  dictated  a  focal  length  of  three  inches. 

The  requirement  on  the  focal  length  of  the  source  lens  is  that  the  image 
of  the  source  on  the  target,  when  reimaged  by  the  receiving  optics  on  a 
detector,  will  be  smaller  than  the  detector’s  sensitive  area.  For  the  light  source 
and  photodiode  employed,  equal  source  and  receiving  lens  focal  lengths 
yielded  adequate  range  sensitivity. 

Fresnel lenses were used instead of glass lenses or reflectors because
such elements, 1/16-inch thick, were available, are extremely rugged in
spite of their small weight, and can be made with sufficient correction for
spherical aberration that a “faster” lens (i.e., of larger effective aperture)
results than is obtainable with glass.

There  are  some  types  of  externally  produced  light  likely  to  cause 
trouble:  sunlight,  incandescent,  and  fluorescent  light  (that  is,  60  cps  and 
120 cps modulated light), and transients produced by reflection (like that
of sunlight from a window or a passing automobile).

The effect of changes in ambient light may be eliminated by using a
diode whose response is linear over a wide range of light intensities and
employing an ac-coupled amplifier.

The  source  light  must  be  coded  in  some  way  to  distinguish  it  from  other 
light  viewed  by  the  receiver.  This  coding  may  take  the  form  of  sine  wave 
or  square  wave  modulation  at  a  frequency  sufficiently  distant  from  60  or 
120  cps  to  allow  easy  filtering.  For  optimum  signal/noise  ratio  the  receiving 
amplifier  should  have  a  very  sharp  pass  band  and  this  then  puts  considerable 
stability  demands  on  the  lamp  modulation  oscillator. 

Alternatively  the  lamp  may  be  pulsed  with  infrequent  high  energy  pulses, 
and  the  amplifier  gated  “on”  at  the  appropriate  times,  to  receive  the  returned 
light  pulses.  This  system  has  the  advantage  that  if  the  pulse  generator  drifts 
the  amplifier  will  follow.  It  is  only  necessary  to  pulse  the  lamp  as  often  as 
new  information  is  required,  in  our  case  about  20  times  per  second.  The 
amplitude  and  duration  of  the  pulse  must  be  such  as  to  produce  as  much 
energy  in  the  lamp  as  is  required.  The  shorter  the  duration  of  the  pulse, 
however,  the  more  time  the  amplifier  will  be  gated  “off,”  and  thus  the 
higher  will  be  its  transient  rejection  capabilities.  Thus  this  system  has  the 
further  advantage  over  a  continuously  modulated  system  that  it  is  capable 
of rejecting light transients. The present detector has a theoretical transient
rejection ratio of 200 to 1.

The  functioning  of  the  instrument  may  be  summarized  as  follows.  A 
pulse  generator  produces  pulses  200  to  250  microseconds  wide  at  the  rate 
of  20  per  second.  These  pulses  are  amplified  and  used  to  excite  a  xenon 
glow  lamp.  The  resulting  light  flashes  are  of  the  same  duration  and  fre¬ 
quency  as  the  initiating  pulses. 
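The pulse figures can be related to the quoted transient rejection ratio of 200 to 1 by a simple duty-cycle calculation. The sketch below assumes that the theoretical rejection ratio is the pulse period divided by the time the gate is open, and that the gate is open for exactly one pulse width; this interpretation is an assumption, not a statement from the paper.

```python
# Duty-cycle estimate of transient rejection for a gated, pulsed system.
PULSE_RATE_HZ = 20.0            # pulses per second, from the text
PULSE_WIDTHS_S = (200e-6, 250e-6)  # pulse width limits, from the text


def rejection_ratio(pulse_rate_hz: float, gate_width_s: float) -> float:
    """Assumed model: ratio of the pulse period to the gate-open time."""
    period_s = 1.0 / pulse_rate_hz
    return period_s / gate_width_s


if __name__ == "__main__":
    for width_s in PULSE_WIDTHS_S:
        ratio = rejection_ratio(PULSE_RATE_HZ, width_s)
        print(f"gate open {width_s * 1e6:.0f} microseconds -> about {ratio:.0f} to 1")
```

On this model a 250-microsecond gate at 20 pulses per second is open for one part in 200 of the time, which agrees with the figure quoted for the present detector; a 200-microsecond gate would give about 250 to 1.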

The  source  lens  focuses  the  light  into  a  beam.  Reflections  from  obstacles 
in  the  beam  are  caught  by  the  receiving  lens  and  focused  onto  a  pair  of 
photodiodes,  the  output  of  which  is  amplified  by  the  light  signal  amplifier. 

The  output  of  the  pulse  generator  is  also  amplified  by  the  gate  signal 
amplifier,  which  is  used  to  open  a  gate  at  the  correct  moment  and  to  re¬ 
ceive  any  pulse  from  the  light  signal  amplifier  which  could  have  resulted 
from  a  reflection  of  a  flash  from  the  xenon  lamp.  The  gate  is  closed  at  all 
other  times  to  discriminate  against  spurious  light  from  other  sources. 

A  light  signal  pulse  passed  by  the  gate,  if  large  enough,  will  trip  the 
one-shot  multivibrator.  When  triggered,  this  circuit  produces  a  pulse  of 
sufficient  magnitude  to  drive  the  stimulator,  causing  it  to  give  the  finger 
a  single  poke. 
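The signal chain summarized above (gate, threshold, one-shot) can be sketched as a small simulation. This is a simplified model under stated assumptions, not the authors' circuit: the gate is taken to be open for one pulse width at the start of each 50-millisecond period, the trip level of the one-shot is a hypothetical value in arbitrary units, and the names `LightEvent`, `gate_is_open`, and `stimulator_pokes` are illustrative.

```python
# Simplified model of the gated receiver: a light event produces a poke
# only if it arrives while the gate is open and exceeds the trip level.
from dataclasses import dataclass

PULSE_RATE_HZ = 20.0     # from the text
PULSE_WIDTH_S = 250e-6   # upper pulse width from the text
TRIP_LEVEL = 1.0         # hypothetical one-shot trip level, arbitrary units


@dataclass
class LightEvent:
    time_s: float      # arrival time of the light at the photodiodes
    amplitude: float   # amplified signal level, arbitrary units


def gate_is_open(time_s: float) -> bool:
    """True while the lamp pulse (and hence the gate) is active."""
    period_s = 1.0 / PULSE_RATE_HZ
    return (time_s % period_s) < PULSE_WIDTH_S


def stimulator_pokes(events):
    """Times at which the one-shot would fire and poke the finger."""
    return [e.time_s for e in events
            if gate_is_open(e.time_s) and e.amplitude >= TRIP_LEVEL]


if __name__ == "__main__":
    events = [
        LightEvent(0.0001, 2.0),   # reflection of the lamp flash: accepted
        LightEvent(0.0200, 5.0),   # strong glint between flashes: gate closed
        LightEvent(0.0501, 0.5),   # weak reflection: below the trip level
        LightEvent(0.1001, 1.5),   # reflection of a later flash: accepted
    ]
    print(stimulator_pokes(events))
```

The strong glint in the example is rejected purely by timing, regardless of its amplitude, which is the advantage claimed for the pulsed system over continuous modulation.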

Figure 2 shows the device with its cover removed. The cover that
has been removed contains an access door to allow removal of the lamp,
which is located behind the source lens. This lamp is mounted in a
prefocused base to facilitate replacement.

Figure 2 Rear View of Obstacle Detector with Cover and Handle Removed

The returned light, after passing through the receiving lens, is directed
by a front-surfaced mirror onto the near- or far-range photodiode. The
mirror is introduced to save space. Depressing the Range Switch in the
handle removes the far-range photodiode from the circuit to facilitate
traveling in crowded locales.

The output from the photodiodes, after suitable amplification, actuates
the stimulator which pokes the index finger through a small hole in the
handle.

The  instrument  is  powered  by  a  12-volt  rechargeable  battery  which 
supplies  33  milliamperes  with  the  stimulator  running.  The  average  lamp 
power  required  is  approximately  130  milliwatts,  including  the  lamp  drive 
circuit.  The  instrument  is  sufficiently  sensitive  that  a  mirror  placed  a  quar¬ 
ter-mile  away,  producing  a  light  path  one-half-mile  long,  will  return  suffi¬ 
cient  signal  to  saturate  the  amplifier. 
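The power figures above lend themselves to a brief budget calculation. The sketch below uses the 12-volt supply, the 33-milliampere drain, and the 130-milliwatt lamp power from the text, together with the four-hour battery life named in the evaluation report's recommendations; the assumption that the drain is constant over that period is ours.

```python
# Power budget for the obstacle detector, from the figures quoted in the text.
SUPPLY_VOLTS = 12.0
DRAIN_MA = 33.0          # with the stimulator running
LAMP_POWER_MW = 130.0    # average, including the lamp drive circuit


def total_power_mw(volts: float = SUPPLY_VOLTS, drain_ma: float = DRAIN_MA) -> float:
    """Total power drawn from the battery, in milliwatts."""
    return volts * drain_ma


def required_capacity_mah(hours: float, drain_ma: float = DRAIN_MA) -> float:
    """Battery capacity needed for the given running time at constant drain."""
    return drain_ma * hours


if __name__ == "__main__":
    power = total_power_mw()
    print(f"total power: {power:.0f} mW")
    print(f"lamp share:  {LAMP_POWER_MW / power:.0%}")
    print(f"capacity for 4 hours: {required_capacity_mah(4.0):.0f} mAh")
```

On these figures the lamp and its drive circuit account for roughly a third of the total drain.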


REFERENCES 

1.  Benham, Thomas A. Evaluation of Signal Corps Sensory Aid for the Blind,
AN/PVQ (XE-2). Washington, D. C.: Veterans Administration, 1952. (Report
of 25 April 1952. Contract No. V100LM-1900.)

2.  Cranberg, Lawrence C., “Sensory Aid for the Blind,” Electronics, Vol. 19
(March 1946), p. 116.

3.  Kallmann, H. E., “Optar, a Method of Optical Automatic Ranging as Applied
to a Guidance Device for the Blind,” Proc. I.R.E., Vol. 42 (1954), pp.
1438-1446.

4.  Roberts, A., “Ultrasonic Echo Sounding Equipment for the Blind,” Radio News
(June 1949), p. 6.

5.  Slaymaker, F., and W. F. Meeker, “Blind Guidance by Ultrasonics,” Electronics,
Vol. 21 (May 1948), p. 76.

6.  Twersky, V., “Auxiliary Mechanical Sound Sources for Obstacle Perception by
Audition,” J. Acoust. Soc. Amer., Vol. 25, No. 1 (January 1953), pp. 156-157.

7.  Witcher, C. M., and L. Washington, Jr., “Echo Location for the Blind,”
Electronics, Vol. 27, No. 12 (December 1954), pp. 136-137.

8.  Zahl, Paul A. Blindness: Modern Approaches to the Unseen Environment.
Princeton: Princeton University Press, 1950.


ACTIVE  ENERGY  RADIATING  SYSTEMS: 
THE  ELECTRONIC  CANE 


ROBERT  J.  GIBSON 

The  Franklin  Institute,  Philadelphia,  Pennsylvania 


Since  about  1948  the  Franklin  Institute  has  been  engaged  in  the  evalu¬ 
ation  and  the  development  of  an  electronic  cane.  This  cane  possesses  the 
normal  functions  of  a  primitive  stick;  but  it  is  enhanced  through  elec¬ 
tronics  by  the  additional  ability  to  warn  its  user  of  step-downs  in  his  path 
without  tapping  the  tip  of  the  cane.  This  program  has  been  supported 
since  its  inception  by  the  W.  K.  Kellogg  Foundation,  and  more  recently 
by  the  Office  of  Vocational  Rehabilitation  of  the  United  States  Depart¬ 
ment  of  Health,  Education,  and  Welfare.  During  this  time  our  ideas  of  the 
requirements  of  a  travel  aid  of  this  kind,  and  embodiments  of  them,  have 
passed  through  many  stages  and  models.  I  will  not  go  into  historical  de¬ 
tails,  however;  rather,  I  will  describe  some  of  our  present  work. 

First  of  all,  an  electronic  cane  should  have  the  physical  attributes  of 
a  normal  cane;  it  should  give  only  a  little  more  information  than  a  simple 
stick;  it  should  be  normally  silent  and  it  should  be  as  “fail-safe”  as  possible. 

The  primitive  cane,  which  is  a  simple  extension  of  the  arm,  is  still  the 
most  successful  and  popular  guidance  device  now  in  existence.  Any  newer 
device  should  contain  as  many  properties  of  a  stick  as  possible,  adding  only 
new  useful  information. 

Second,  after  an  enormous  amount  of  advice  and  discussion  with  all 
those  concerned  with  the  problems  of  the  blind,  we  feel  that  step-down 
detection  is  one  of  the  most  important  problems  to  solve;  so  our  cane 
detects  step-downs.  This  is  a  single  piece  of  information:  it  doesn’t  tell 
how  deep  the  step-down  is.  It  says,  “The  ground  has  fallen  away,  so  take 
appropriate  action.”  To  give  just  a  single  piece  of  information  here  is 
important;  if  you  give  too  much  information  the  user  will  become  con¬ 
fused.  In  addition,  it  is  easy  to  learn  to  respond  to  a  single  piece  of  in¬ 
formation  so  that  the  evasive  action  may  be  taken  quickly. 

Third, I mentioned the “normally silent” operation. We believe this
is important because a continuous stimulus fed to the user will either be
ignored, or it will interfere with other cues from the environment. A
continuous stimulus becomes a hindrance rather than a help. A single strong
stimulus warning of danger is usually met with the appropriate avoiding
activity.

Fourth,  the  “fail-safe”  feature.  It  should  be  almost  unnecessary  to 
mention  that  this  is  a  necessary  inclusion;  if  a  cane  with  additional  elec¬ 
tronics  gives  additional  information,  it  is  still  a  cane  and  should  be  usable 
as  such.  Further,  while  you  can’t  make  the  electronics  fail  in  a  way  that 
is  safe,  you  can  make  them  as  reliable  as  possible;  you  can  make  it  easy 
to  check  them.  These  refinements  have  been  incorporated  into  our  cane. 

As  you  have  probably  guessed,  these  were  not  a  priori  criteria  that  we 
laid  down  at  the  beginning  of  our  work,  but  what  we  found  necessary  in 
the  course  of  our  work.  In  retrospect  these  ideas  look  very  good  to  us, 
especially  since  our  cane  conforms  to  all  of  these  requirements. 

What  of  the  physical  configuration  of  the  cane?  It  has  a  hollow  tapered 
shaft  about  three  feet  long,  measured  from  the  end  of  the  handle.  The 
shaft  is  about  five-eighths  inch  diameter  at  the  upper  end,  and  about  three- 
eighths  inch  at  the  lower,  with  a  hardened  metal  tip.  The  handle  is  some¬ 
what  bulky  compared  to  the  normal  cane,  and  is  about  nine  inches  long, 
of  high-impact-resistance  plastic.  The  handle  contains  all  the  electronic 
equipment;  has  a  hook  at  the  rear  which  can  be  put  over  the  arm;  has  a 
cross-section  area  about  the  same  as  an  ordinary  flashlight;  and  has,  at 
the  forward  end,  a  small  knob  which  is  used  for  tuning  the  cane  or  ad¬ 
justing  the  distance  at  which  the  tip  is  carried  from  the  ground.  At  the 
rear  is  another  knob  which  is  used  for  turning  on  the  electronics  and  for 
adjusting  the  volume  of  the  auditory  signal  if  that  is  desired.  On  the  side 
of  the  cane,  approximately  at  the  fleshy  part  of  your  hand  adjacent  to  your 
little  finger,  there  is  a  small  protruding  pin.  This  pin  gives  a  tactile  stimulus 
and  tells  the  user  that  the  tip  of  the  cane  is  more  than  two  inches  or  so 
from  the  ground.  Actually,  the  cane  can  be  adjusted  so  that  it  will  give 
no  signal  from  one  inch  from  the  ground  up  to  six  inches  from  the  ground. 
We  have  found  that  two  inches  from  the  ground  is  the  best  distance  to  use. 

There  is  a  place  to  plug  in  the  earpiece  and  there  is  a  place  to  plug 
in  the  charging  cord.  The  charging  cord  has  an  isolation  transformer  to 
avoid  electric  shock;  this  could  be  incorporated  into  the  charging  accessory. 

The  chief  advantage  of  this  cane  is  that  it  sweeps  a  zig-zag  path  about 
three  inches  wide  ahead  of  the  traveler,  whereas  a  tapping  stick  only 
samples  a  tiny  spot  on  the  pavement  or  street  where  it  actually  touches  the 
ground.  Tapping  with  the  electronic  cane  is  unnecessary  until  a  step-down 
is  detected;  then  we  suggest  its  use  as  a  cane  and  as  a  probe. 




A  false  signal,  defined  as  one  which  comes  from  raising  the  tip  of  the 
cane  above  the  two  inches,  is  to  be  avoided.  Under  normal  conditions  no 
signal  is  received  unless  the  distance  becomes  greater  than  this  set  distance. 

The  tactile  signal  is  relatively  quiet  when  the  hand  is  covering  it;  it 
vibrates  at  about  10  cps. 

Now,  our  original  design  included  just  the  auditory  signal;  we  did  not 
have  a  tactile  signal.  We  felt  that  it  was  most  important  to  give  a  tactile 
signal  for  several  reasons,  among  them:  1)  so  that  the  auditory  signal 
would  not  mask  any  other  cues  in  the  vicinity;  2)  so  that  the  signal  would 
not  be  obliterated  by  heavy  traffic  noises;  and  3)  to  eliminate  the  dress¬ 
ing-up  and  cosmetic  complications  of  an  earpiece.  The  tactile  signal  has 
proved  to  be  quite  successful.  The  response  to  a  drop-off  is  quick. 

This  cane  does  not  detect  up-obstacles  of  any  kind  with  its  electronics; 
it  does  detect  them  in  the  normal  fashion — by  bumping  into  them  as  any 
stick  would  do. 

The  description  of  the  electronics  is  as  follows.  This  is  essentially  a 
proximity  device.  It  will  detect  a  distance  greater  than  the  set  distance 
(usually  two  inches)  from  any  grounded  surface  including  asphalt,  con¬ 
crete,  or  wooden  boards  in  contact  with  the  ground.  There  are  two  oscil¬ 
lators  in  the  front  of  the  cane.  They  are  tuned  to  about  two  megacycles. 
One  of  these  has  a  tuned  circuit  which  includes  the  tip  of  the  cane  as 
one  plate  of  the  capacitor.  The  other  plate  of  the  capacitor  is  the  ground, 
the  circuit  is  completed  through  the  user’s  body.  The  second  oscillator 
has  a  capacitor  which  is  connected  to  the  tuning  knob  in  the  front,  and 
this  can  be  adjusted  so  that  the  two  oscillators  are  operating  at  the  same 
frequency.  When  they  are  at  the  same  frequency  there  is  no  signal.  When 
the  tip  of  the  cane  moves  away  from  the  ground,  or  the  ground  moves  away 
from  the  tip  of  the  cane,  the  capacity  decreases,  the  frequency  increases, 
and  this  difference  in  the  frequency  of  the  two  oscillators  is  detected, 
amplified,  and  heard.  This  is  further  amplified  to  turn  on  the  motor.  Inside 
are  some  circuit  boards  which  resemble  those  of  a  six-transistor  pocket 
radio. There is a battery in the handle, a rechargeable nickel-cadmium
cell  of  10.8  volts  and  250  milliampere-hour  capacity.  There  is  a  small 
motor  an  inch  and  a  half  in  diameter,  a  little  less  than  an  inch  long.  There 
is  a  charging  diode  and  dropping  resistor  so  that  the  cane  can  be  plugged 
into  any  110  volt  ac  main  to  charge  the  battery  (the  cane  may  be  left  on 
“charge”  indefinitely  without  harming  the  battery).  The  circuit  draws  about 
8  ma  when  the  motor  is  not  operating  (and  it  is  usually  not  operating;  it  is 
only operating when the ground falls away from the tip of the cane). The
current rises to between 100 and 200 milliamperes when the motor is on.
This  is  a  relatively  high  drain,  but  it  normally  lasts  only  for  a  second  or  two 
and  the  energy  consumed  is  relatively  small. 
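
The two-oscillator arrangement described above can be sketched numerically. This is only an illustration: the ideal LC resonance formula is standard, but the capacitance value, the derived inductance, and the 0.1 pF change on lifting the tip are assumptions chosen to put the oscillators near the two megacycles mentioned in the text; none of them comes from the talk.

```python
import math


def lc_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC tuned circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def beat_frequency_hz(inductance_h, reference_c_f, sensing_c_f):
    """Sensing-oscillator frequency minus reference-oscillator frequency."""
    return (lc_frequency_hz(inductance_h, sensing_c_f)
            - lc_frequency_hz(inductance_h, reference_c_f))


# Hypothetical values, chosen only so that both oscillators run near 2 MHz.
REFERENCE_C = 100e-12  # farads (assumed)
INDUCTANCE = 1.0 / ((2.0 * math.pi * 2.0e6) ** 2 * REFERENCE_C)  # henries

# Tip at the set distance: the tuning knob has matched the two capacitances.
balanced = beat_frequency_hz(INDUCTANCE, REFERENCE_C, REFERENCE_C)

# Tip lifted: assume the tip-to-ground capacitance falls by 0.1 pF.
lifted = beat_frequency_hz(INDUCTANCE, REFERENCE_C, REFERENCE_C - 0.1e-12)

print(f"balanced: {balanced:.1f} Hz, tip lifted: {lifted:.1f} Hz")
```

Under these assumed values a capacitance change of one part in a thousand shifts the sensing oscillator by roughly one kilocycle, which suggests why a small change at the tip can yield an audible difference tone.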

Many  problems  arose  in  designing  this  cane.  One  was  the  selection  of 
tactile  stimulus.  We  went  to  the  literature  to  find  out  what  was  the  best 
frequency, the best amplitude, the best place, and so forth. Most of the information
was not too useful. The literature usually talked about the most
sensitive areas of stimulus, like the tip of your finger, which meant you
had  to  place  your  finger  in  the  proper  position  on  the  pin.  It  gave  optimum 
frequencies — but  usually  at  low  levels;  and  so  forth.  We  had  to  produce  this 
stimulus  so  that  it  could  always  be  felt,  even  through  gloves,  for  use  in  the 
winter  time.  We  found  the  stimulus  took  at  least  one-half  watt  of  electrical 
energy  into  a  reasonably  efficient  electromechanical  device  to  be  felt.  So 
we  tried  many  other  solutions.  This  motor  is  the  outcome  and  it  is  quite 
successful. 

It  takes  about  one-half  to  one  and  one-half  watts,  depending  on  how 
tightly  the  hand  is  pressed  against  the  pin.  I  have  used  the  word  pin; 
actually,  it  is  a  little  rod  with  a  rounded  end.  It  is  5/32  of  an  inch  in  diameter, 
and  it  has  a  maximum  stroke  of  about  3/32  of  an  inch,  moving  in  and  out 
of  a  hole  in  the  side  of  the  cane.  It  can  be  placed  on  the  right  side  or  the 
left  side  simply  by  drilling  a  hole  in  the  case.  Even  left-handed  persons  can 
operate  the  cane  by  holding  the  hand  in  a  slightly  different  position  to  catch 
the little finger (if you are left-handed we recommend that you put it on the
other  side). 

We  have  used  rugged  materials  throughout.  As  I  said,  it  is  a  fiberglass 
plastic  bonded  shaft;  it  is  practically  indestructible.  Of  some  100  we  have 
had  and  tried  under  various  circumstances,  I  think  we  managed  to  break 
one.  Once  in  a  while  the  metal  tip  is  knocked  off,  but  this  is  replaceable. 
The  electronics  are  built  on  circuit  boards  and  coated  with  a  material  which 
holds  them  in  place.  The  most  vulnerable  things  are  the  little  knobs  and  these 
could  easily  be  recessed. 

The  cane  weighs  just  two  pounds,  which  is  considerably  heavier  than 
an  ordinary  wooden  stick,  but  it  is  very  well  balanced.  When  you  grasp  it 
in  your  hand  it  is  very  comfortable,  and  falls  naturally  in  place.  It  does 
not  cause  fatigue  in  use  and  some  people  prefer  to  use  this  rather  than  an 
ordinary  stick  even  when  the  electronics  are  not  operative  because  it  is  so 
easy  to  carry. 

With  previous  canes,  not  in  this  exact  configuration,  and  with  much 
inferior circuitry (using tubes), we have had very good reports. We have
trained about 80 people to use the previous models, and in spite of all the
difficulties and failures of the electronics, about three-quarters of this number
found that they could learn readily and quickly, and travel safely and
rapidly,  with  it.  Approximately  10  percent  were  disgusted  with  it,  would 
have  nothing  to  do  with  it.  This  I  think  is  actually  a  very  good  report. 

We  recommend  that  approximately  a  half-hour  training  session  be 
given,  either  with  a  blind  instructor  or  sighted  instructor.  About  five  of 
these  half-hour  sessions,  spaced  over  a  period  of  a  week,  plus  about  two 
weeks  practice  under  minimum  supervision  would  complete  the  training 
program  for  almost  anyone.  Some  people  learn  to  use  the  cane  successfully 
with  a  half-hour’s  instruction  and  practice,  and  have  gone  out  and  used  it. 
Whenever  confusion  arises  or  anything  happens  to  the  electronics,  it  is  still 
a  cane. 

This model has been produced for us by the Stromberg Carlson Division of the
General Dynamics Corporation over the last several years, and about 100 of
these canes have been built. The first 35 were experimental; there were a lot
of “bugs” to be worked out. About 65 canes are now of uniform construction
and uniform electronics, and operate in a very similar manner. The best
50 of these have been selected, and 32 are now out in the hands of instructors
who are not under our direct supervision, for carrying out what we
call a final evaluation program. We hope that perhaps three people will be
trained for each cane that is out in the field, and that will give us a cross-section
of 100 people who have learned how to use it or tried to learn how
to use it.

There  are  three  things  we  want  to  find  out  from  the  final  program: 
(1) can and does the blind traveler travel better with the electronic cane
than  with  an  ordinary  cane,  (2)  does  he  accept  it  and  want  it  and  does  the 
instructor  accept  it  and  want  it,  and  (3)  what  is  the  best  way  to  teach  its 
use? We hope this program will help us to work out a standard instruction
program. Answering the many questions about how the cane is used and how
it is taught is a necessary and difficult task, and we have developed simple
questionnaires which we hope will simplify it.

This  cane  is  not  a  cure-all  for  the  blind  traveler.  Rather,  I  believe  it  has  a 
useful  place  in  the  hierarchy  of  mobility  devices.  It  is  not  and  cannot  be  a 
final  and  perfect  device  or  answer  to  the  problems  of  guidance,  but  if  it  aids 
a  significant  number  of  the  blind  to  travel  more  easily  than  with  a  primitive 
stick  we  will  feel  rewarded  for  our  efforts. 


PASSIVE  SYSTEMS:  A  PROPOSED 
STEREO-OPTICAL  EDGE  DETECTOR* 


AVERY  R.  JOHNSON 

Bio-Dynamics,  Inc.,  Cambridge,  Massachusetts 


In  the  remarks  which  follow  I  shall  be  as  brief  as  possible.  One  reason  for 
this brevity is that the device I am going to describe is not yet fully developed.
This prototype is in fact the original laboratory model, the only model
in  existence. 

The  idea  for  this  device  came  to  mind  while  I  was  attending  the  Mobility 
Conference  co-sponsored  by  the  American  Foundation  for  the  Blind  and  the 
Veterans Administration at Massachusetts Institute of Technology in October
of 1961. It seemed so simple that I could scarcely believe that no one
had  tried  it  before.  One  of  my  purposes  in  making  this  description  is  to  find 
out  whether  any  of  you  know  of  similar  work  in  the  past.  When  you  invent 
something  there  is  a  certain  excitement  in  the  invention;  but  there  is  also  the 
responsibility  of  seeing  it  through — and  no  one  relishes  undertaking  a  task 
that  has  already  been  done. 

There is nothing magical about the device; it merely detects near-by contrast
boundaries. The device is built into a stereo camera. If you take a photograph
of a distant scene with such a camera, the images found in the image
planes  are  identical.  If  there  is  an  object  near  by,  however,  different  images 
will  be  found  at  corresponding  points  in  the  image  planes  due  to  parallax. 

Let  us  now  substitute  a  mosaic  of  photosensitive  elements  in  the  two 
image planes in place of film. The light energy falling on these corresponding
elements in the two planes will be identical for a distant object. If there is
a  contrast  boundary  near  by,  however,  the  light  energy  will  be  different  on  at 
least  two  cells,  if  the  mosaic  is  fine  enough.  The  mosaic  “grain”  may  be 
made  much  coarser  if  one  allows  scanning  of  the  object  with  the  contrast 
boundary.  The  parallax  of  the  contrast  boundary  will  then  pass  over  each 
corresponding  pair  of  photocells  in  turn. 
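
The parallax argument can be made concrete with the usual stereo relation, in which the image shift between the two planes is the focal length times the lens separation divided by the object distance. The sketch below is illustrative only: the focal length, lens spacing, and cell width are assumed values, and the test for "different cells" is a deliberately crude one.

```python
def disparity_mm(focal_length_mm, baseline_mm, distance_mm):
    """Image shift between the two image planes for a point at this distance."""
    return focal_length_mm * baseline_mm / distance_mm


def cells_differ(disparity, cell_width_mm):
    """Crude criterion: the shift is at least one photosensitive element wide."""
    return disparity >= cell_width_mm


FOCAL_MM = 35.0     # assumed lens focal length
BASELINE_MM = 70.0  # assumed center-to-center spacing of the lens systems
CELL_MM = 1.0       # assumed width of one photosensitive element

for distance_m in (1.0, 2.0, 5.0, 50.0):
    shift = disparity_mm(FOCAL_MM, BASELINE_MM, distance_m * 1000.0)
    print(f"{distance_m:5.1f} m: shift {shift:.3f} mm, "
          f"cells differ: {cells_differ(shift, CELL_MM)}")
```

With these assumed numbers a boundary a meter or two away shifts by more than one element, while a distant boundary shifts by a small fraction of one, which matches the near-field behavior described in the text.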

The result is a difference in the light falling on corresponding photosensitive
elements in the two image planes. In terms of hardware, the first
suggestion that comes to mind is to connect pairs of photocells as arms of a
bridge circuit, to drive the bridges with separate audio oscillators, and to
listen to the sum of the outputs. I shall not enter here the controversy of
aural versus tactile output, except to say this: I have used this device while
wandering around in various environments and found virtually no masking.
One attractive feature of the system is that, at least theoretically, there
is an output only when there is an object in the near field.

* This paper has been prepared from a revised version of an oral presentation. (Ed.)

In  practice,  one  holds  the  camera  in  the  horizontal  plane,  and  scans  in 
the  horizontal  plane.  Any  near-by  edge  will  be  detected  if  it  has  a  vertical 
component.  To  detect  horizontal  edges,  the  camera  is  tipped  up  vertically 
and scanned vertically. By “horizontal” I mean holding the camera in such
a  way  that  the  plane  of  the  lens  systems  is  parallel  with  the  ground;  the 
device  is  scanned  in  the  same  plane. 

Note  that  the  detection  of  boundary  condition  is  virtually  independent 
of  over-all  light  level  because  we  are  looking  only  for  differences  in  the 
light  falling  on  corresponding  pairs  of  photocells.  Thus  the  device  will  work 
as  well  in  reasonably  dim  light  as  in  bright  light,  provided  the  photocells  are 
well  matched. 
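
The independence from over-all light level, and its dependence on cell matching, can be illustrated with a simple model. The sketch assumes a power-law photoresistor (resistance proportional to illumination raised to a negative exponent) and a bridge whose other two arms are equal fixed resistors; the constants `k` and `gamma` are hypothetical and are not taken from the Clairex data.

```python
def cell_resistance_ohm(illumination, k=1.0e5, gamma=0.8):
    """Assumed power-law photoresistor: resistance falls as light increases."""
    return k * illumination ** (-gamma)


def bridge_output(drive_v, light_a, light_b, gamma_a=0.8, gamma_b=0.8):
    """Output of a bridge whose two active arms are a pair of photocells.

    The other two arms are taken as equal fixed resistors, so their
    midpoint sits at half the drive voltage.
    """
    r_a = cell_resistance_ohm(light_a, gamma=gamma_a)
    r_b = cell_resistance_ohm(light_b, gamma=gamma_b)
    return drive_v * (r_b / (r_a + r_b) - 0.5)


# Matched cells: only the ratio of the two illuminations matters.
dim = bridge_output(1.0, 10.0, 20.0)
bright = bridge_output(1.0, 100.0, 200.0)
print(f"matched cells, dim scene:    {dim:+.4f} V")
print(f"matched cells, bright scene: {bright:+.4f} V")

# Mismatched exponents: equal light no longer balances, and the error
# changes with the light level.
print(f"mismatched, dim:    {bridge_output(1.0, 10.0, 10.0, 0.8, 0.9):+.4f} V")
print(f"mismatched, bright: {bridge_output(1.0, 100.0, 100.0, 0.8, 0.9):+.4f} V")
```

In this model a mismatch in the exponent produces an output even with equal light on both cells, and that output grows with brightness, which is consistent with the balancing difficulty in bright light reported for the prototype.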

In  this  model  Clairex  605-1  photocells  have  been  used.  The  6-series  of 
cells  are  small,  about  one-quarter  inch  in  diameter.  Three  of  these  are  used 
in  each  plane.  The  spectral  response  of  the  cells  is  close  to  that  of  the  human 
eye.  Unfortunately  these  photoresistors  are  not  identical  by  pairs.  In  the 
type  of  bridge  circuit  used  in  this  device,  one  can  balance  the  output  of  the 
bridge  with  a  potentiometer  in  the  customary  way,  that  is,  match  the 
average  resistance  of  the  cells.  With  a  second  potentiometer  it  is  possible  to 
match  the  first  derivative.  If  it  is  desired  to  match  the  curves  even  more 
closely,  one  must  go  to  more  and  more  complicated  controls.  Another 
problem  is  that  when  the  voltage  on  the  cell  is  varied  for  a  fixed  light  input, 
the  resistance  is  not  constant.  This  condition  obtains  in  the  present  device 
since  the  cells  are  driven  with  an  audio  oscillator.  The  result  is  that  second 
harmonics are generated when a state of imbalance of light energy is encountered.

The output of the device with all bridges imbalanced is a chord composed
of a fundamental, third, and fifth. On the demonstration tape the
second harmonics mentioned are audible. These can be reduced in intensity
by filtering the output, or eliminated by using more closely matched cells.
In any case, it may turn out that audio bridges are not useful. We would
then have to drive the cells with a dc bridge, detecting a dc signal for transformation
into another, more easily detectable form.



The  Clairex  cells  used  have  a  small,  rectangular  sensitive  area  with 
interdigitated  electrodes,  that  is,  the  photosensitive  substance  zig-zags  back 
and forth. I have masked the entire cell, save for a narrow slit perpendicular
to the electrodes. Thus there is an almost homogeneous distribution
of  small  sensitive  areas  with  electrodes  between  them.  The  dimension  of 
“verticality”  is  in  the  direction  of  the  edges  that  one  tries  to  detect  in  any 
case. 

This  design  is  a  first  try.  Perhaps  the  limitations  of  the  geometry  of  the 
photosensitive  surface  can  be  turned  to  advantage  with  clever  design.  I 
would  welcome  any  suggestions. 

At  night  or  indoors  the  hand  can  be  held  over  one  lens  system  to 
create a simple light probe to locate light bulbs, open doorways, or automobile
headlights; it could also permit the examination of the outlines of
lighted  store  windows,  etc. 

One  further  improvement  or  extension  suggests  itself.  The  pairs  of 
photocells  (in  corresponding  locations  in  each  of  the  frames)  could  be 
selected  from  among  types  of  cells  with  differing  spectral  responses.  In  this 
way, the ambiguity between hue and brightness could be avoided: if there
were a boundary which had no brightness contrast but only a contrast of hue
on its two sides, some set of cells would pick it up. Further, with cells of
different spectral response, if one plane is occluded or is illuminated by a
reference light inside
the device, the other plane would permit an appreciation of the color
of  an  object  in  terms  of  the  differing  intensities  of  the  graded  output  of  the 
three  channels.  Eventually,  if  this  technique  works,  one  could  package  the 
entire  device  in  a  very  small  box.  The  only  dimension  of  this  prototype 
which  need  be  preserved  is  the  center-to-center  distance  between  the  lens 
systems — and  perhaps  lens  systems  aren’t  even  necessary.  A  pinhole 
camera  would  work  as  well  if  the  photocells  are  responsive  enough.  The 
microelectronics  could  be  packaged  between  the  lens  systems.  The  device 
could  then  be  held  in  the  palm  of  the  hand  and  the  output  fed  through  a 
fine  wire  into  a  single  small  earphone. 

TAPE-RECORDED  DEMONSTRATION 

I  had  some  recent  tape-recorded  examples  made,  not  from  the  “normal 
pedestrian  environment,”  but  from  situations  of  rather  extreme  contrast, 
to  show  more  clearly  the  operation  of  the  device.  The  only  parameter  that 
could  not  be  recorded  is,  of  course,  the  nature  of  the  scanning  rate  and 
pattern.  It  will  be  obvious  in  some  cases;  in  others,  I  pantomimed  the 
speed and direction of scanning. The terms “bright doorway” and “bright
slit” refer respectively to a darkened room with sunlight coming through
an open doorway, and the same door three inches ajar. The sidewalk mentioned
in the outdoor examples was not a real sidewalk but a reasonable
facsimile  for  recording  purposes. 

Example 1: An open bright doorway, scanned slowly back and forth
from  six  feet  away. 

Example  2:  The  same  doorway,  scanned  at  the  same  rate,  from  20  feet 
away. 

Example  3:  Walking  slowly  toward  a  three-inch  wide,  bright  slit  from 
20  feet  away,  and  ending  at  the  slit  itself. 

Example  4:  Scanning  a  far-off,  bright  outdoor  environment,  aiming 
lower  and  lower,  followed  by  a  look  at  my  own  shadow.  (This  example 
showed some of the imbalance problem. The bridges were difficult to balance
in bright light.)

Example  5:  Walking  along  the  last  50  feet  of  a  bright  sidewalk  toward 
a  dark  road;  scanning  vertically  from  the  horizon  to  my  feet  and  back  to 
the  horizon  again.  The  boundary  between  the  sidewalk  and  the  road 
gradually  becomes  clearer  as  it  is  approached. 

Example  6:  A  bright  light  bulb  against  a  neutral  background.  This  might 
also  be  an  automobile  headlight. 

Example  7:  A  bright  light  bulb  against  a  neutral  background.  One  lens 
system  is  obscured.  (The  bulb  is  somewhat  easier  to  find  in  this  case.  The 
light  is  picked  up  before  actually  aiming  at  the  bulb.  One  is  not  comparing 
the  energy  received  with  another  sample  from  the  same  direction;  rather 
one is comparing it with a dark background.)


PASSIVE  SYSTEMS:  AMBIENT-LIGHT 
OBSTACLE  DETECTOR  WITH 
TACTILE  OUTPUT* 

BERTIL  JACOBSON 

Karolinska  Institutet,  Stockholm,  Sweden 


Obstacle  detectors  for  the  blind  are  either  active  or  passive  in  type.  Active 
ones  emit  some  kind  of  radiation  and  intercept  the  energy  reflected  by  the 
obstacle. Because of the emitter, the weight and power consumption of
such systems should be greater than for passive systems employing
ambient  light,  if  the  receiving  circuits  for  an  active  and  a  passive  system  are 
of  comparable  size. 

The  purpose  of  this  investigation  has  been  to  explore  the  possibilities  of 
an  obstacle  detector  employing  ambient  light.  Two  laboratory  models  have 
been built and tested with the purpose of evaluating the ambient light principle
for obstacle detection and comparing it with other available methods.

PRINCIPLE 

The  ambient  light  principle  employed  is  a  further  development  of  the 
Kallmann  Optar  device  (2).  It  relies  on  the  fact  that  a  lens  or  concave 
mirror  focuses  a  sharp  image  of  an  object  in  one  plane  only.  The  distance 
between  the  lens  and  the  image  is  a  function  of  the  distance  from  the  lens 
to  the  object.  Thus  the  practical  problem  is  to  find  a  method  of  indicating 
when  an  image  is  focused  sharply  on  a  plate;  an  obstacle  in  front  of  the 
lens  can  then  be  detected  and  its  distance  determined. 
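
The relation between object distance and image distance referred to here is the ordinary thin-lens (or mirror) equation. The sketch below assumes an ideal thin lens and uses the 9 cm focal length quoted later in the paper for the concave-mirror design; the function names are illustrative.

```python
def image_distance_m(focal_length_m, object_distance_m):
    """Thin-lens equation, 1/f = 1/object + 1/image, solved for the image."""
    return 1.0 / (1.0 / focal_length_m - 1.0 / object_distance_m)


def object_distance_m(focal_length_m, image_distance):
    """Inverse relation: the object distance that is sharp at a plate position."""
    return 1.0 / (1.0 / focal_length_m - 1.0 / image_distance)


FOCAL_M = 0.09  # 9 cm, as quoted for the concave mirror

for distance in (0.5, 1.5, 3.0, 5.0, 10.0):
    plate = image_distance_m(FOCAL_M, distance)
    print(f"object at {distance:4.1f} m -> sharp image {plate * 100:.2f} cm from the optics")
```

Each plate position thus corresponds to exactly one object distance, which is why detecting the moment of sharp focus also yields the range.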

In  the  most  efficient  designs  so  far  tested  by  us,  the  existence  of  a 
sharp  image  is  detected  in  the  following  way.  The  light  focused  on  a  plate 
scans  it  by  means  of  one  or  two  vibrating  mirrors  (Figures  1  and  4).  Light 
leaks  through  a  small  aperture  in  the  plate  and  falls  upon  a  photomultiplier. 
When a sharp image is focused on the plate, transients are obtained in the
multiplier current. When the image is out of focus, no sharp light gradients
sweep over the aperture and no transients occur. The transients activate
a tactile output system. Ranging is accomplished by varying the lens-plate
or mirror-plate distance.

* This investigation was supported by a research grant from the Swedish Technical
Research Council.

DESIGNS 

In  one  design  (Figure  1),  a  concave  mirror  was  employed  for  focusing  the 
image  on  the  plate.  This  made  it  possible  to  save  weight  and  to  use  large 
optical  apertures.  The  plate  was  scanned  in  one  direction  only,  using  one 
vibrating  mirror.  In  this  system,  distance  ranging  can  be  obtained  by  manual 
displacement  of  the  mirror,  each  setting  of  the  mirror  corresponding  to  a 
certain  distance. 


Figure 1 Cross-Section of Obstacle Detector Employing Concave Mirror
Optics and One Vibrating Mirror for Image Scanning

The  output  system  used  is  a  bistable  mechanical  flip-flop.  It  has  the 
advantage  of  not  consuming  any  power  in  the  two  stable  positions.  Energy 
is consumed only when the flip-flop changes position. One position corresponds
to the presence of an obstacle, at which time the pin of the mechanical
flip-flop  can  be  felt  by  the  finger  as  a  protruding  peg.  The  other  position 
is  activated  when  there  is  no  obstacle  present. 

The electronic circuit operates in the following way (Figure 2). To
allow for varying conditions of illumination, the high voltage to the photomultiplier
is automatically regulated to give a constant average current.
Rapid variations in the control current arise from light transients, and they
are amplified and differentiated. Positive and negative differentials are obtained,
depending on whether the aperture is scanned with an image going
from a lighter to a darker, or from a darker to a lighter area. To obtain a
response for both types of signals, the differentials are rectified. To sort out
slow transients, which originate from images slightly out of focus, a rise
time analyzer is used to block the circuit if the transients are not fast
enough. Thus, only transients of the correct rise time can trigger the mechanical
flip-flop.

Figure 2 Block Diagram of Electronic Circuits

Figure 3 The Laboratory Model of the Obstacle Detector Is Built Into a Brief Case
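
The chain of differentiation, rectification, and rise-time gating described for the electronic circuit can be imitated on sampled data. The sketch below is a software stand-in, not the circuit itself: it replaces the rise time analyzer with a simple threshold on the rectified slope, and the sample interval, threshold, and test waveforms are all assumed.

```python
def fires_flip_flop(samples, sample_interval_s, min_slope):
    """True if any rectified derivative of the signal reaches min_slope."""
    for earlier, later in zip(samples, samples[1:]):
        # Differentiate (finite difference), then rectify (absolute value),
        # so that light-to-dark and dark-to-light edges are treated alike.
        slope = abs(later - earlier) / sample_interval_s
        if slope >= min_slope:
            return True
    return False


def ramp(low, high, rise_samples, pad=5):
    """Photocurrent moving from low to high over rise_samples intervals."""
    step = (high - low) / rise_samples
    rising = [low + step * i for i in range(1, rise_samples + 1)]
    return [low] * pad + rising + [high] * pad


DT = 1.0e-3        # assumed sample interval, seconds
MIN_SLOPE = 500.0  # assumed threshold, signal units per second

sharp_edge = ramp(1.0, 2.0, rise_samples=1)     # image in focus
falling_edge = ramp(2.0, 1.0, rise_samples=1)   # same edge, opposite polarity
blurred_edge = ramp(1.0, 2.0, rise_samples=20)  # image slightly out of focus

print(fires_flip_flop(sharp_edge, DT, MIN_SLOPE))
print(fires_flip_flop(falling_edge, DT, MIN_SLOPE))
print(fires_flip_flop(blurred_edge, DT, MIN_SLOPE))
```

The blurred edge carries the same total change in light but spreads it over many samples, so its slope never reaches the threshold and the output is not triggered.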

The  relative  optical  aperture  of  the  concave  mirror  is  1  to  1.8,  the  focal 
length  9  cm  and  the  diameter  of  the  aperture  in  the  plate  is  0.03  cm.  The 
detector  sweeps  a  horizontal  field  of  about  7  degrees  by  means  of  the 
vibrating  mirror. 
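
These figures allow a rough estimate of how quickly an image blurs when the object is away from the set distance. The sketch assumes ideal geometric optics, reads the "relative optical aperture of 1 to 1.8" as an f-number of 1.8 (an interpretation, not stated in the text), and compares the resulting blur circle with the 0.03 cm aperture in the plate.

```python
def image_distance_m(focal_length_m, object_distance_m):
    """Thin-lens (or mirror) equation solved for the image distance."""
    return 1.0 / (1.0 / focal_length_m - 1.0 / object_distance_m)


def blur_diameter_m(focal_length_m, f_number, focused_at_m, object_at_m):
    """Geometric blur-circle diameter on a plate set for focused_at_m."""
    aperture_diameter = focal_length_m / f_number
    plate = image_distance_m(focal_length_m, focused_at_m)
    image = image_distance_m(focal_length_m, object_at_m)
    return aperture_diameter * abs(plate - image) / image


FOCAL_M = 0.09         # 9 cm, from the text
F_NUMBER = 1.8         # assumed reading of "1 to 1.8"
PLATE_HOLE_M = 0.0003  # 0.03 cm aperture in the plate, from the text

for distance in (3.0, 3.2, 5.0, 10.0):
    blur = blur_diameter_m(FOCAL_M, F_NUMBER, 3.0, distance)
    print(f"plate set for 3.0 m, object at {distance:4.1f} m: "
          f"blur {blur * 100:.4f} cm, wider than hole: {blur > PLATE_HOLE_M}")
```

Under these assumptions an object only slightly off the set distance still gives a blur smaller than the hole, while an object several meters off gives a blur several times wider, so no sharp gradient sweeps across the aperture.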

The  system  is  built  into  a  brief  case  (Figure  3).  The  obstacle  detector 
is  switched  “on”  by  pressing  a  lever  with  the  little  finger,  and  the  output 
peg  can  be  felt  with  the  thumb.  The  weight  of  the  present  nonminiaturized 
design  is  10  pounds,  the  volume  0.5  cubic  foot,  and  the  power  consumption 
1.2  watts. 

Automatic  ranging  has  been  obtained  in  another  system  by  changing 
the  lens-plate  distance  cyclically  (Figure  4).  The  plate  on  which  the  image 
is focused is oscillated between the points corresponding to distances between
ten meters and half a meter. The output transducer consists of four
vibrating  plates — one  for  each  of  the  distance  ranges  used:  0.5  to  1.5; 
1.5  to  3.0;  3.0  to  5.0;  and  5.0  to  10.0  meters  (see  Figure  5). 
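
The assignment of a measured distance to one of the four vibrating plates amounts to a simple lookup. In the sketch below the range limits are those given in the text; how the shared boundaries (1.5, 3.0, 5.0 meters) are assigned is an assumption, since the paper does not say.

```python
RANGE_LIMITS_M = [(0.5, 1.5), (1.5, 3.0), (3.0, 5.0), (5.0, 10.0)]


def plate_for_distance(distance_m):
    """Index (0 to 3) of the plate for this distance, or None if out of range.

    A distance on a shared boundary is assigned to the farther range
    (assumption); exactly 10.0 m is assigned to the last plate.
    """
    for index, (near, far) in enumerate(RANGE_LIMITS_M):
        if near <= distance_m < far:
            return index
    if distance_m == RANGE_LIMITS_M[-1][1]:
        return len(RANGE_LIMITS_M) - 1
    return None


for distance in (0.2, 1.0, 1.5, 4.0, 10.0, 12.0):
    print(distance, plate_for_distance(distance))
```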


Figure 4 Principle for Obstacle Detector with Automatic Distance Ranging
and with Image Scanning in Two Directions by Two Vibrating Mirrors

Figure 5 Laboratory Model of Obstacle Detector with Automatic Ranging.
Each of the four vibrating plates in the handle corresponds to a certain
distance range.


DISCUSSION 

The  sensitivity  of  the  instruments  is  satisfactory  under  average  daylight 
conditions  and  good  artificial  illumination,  but  further  development  would 
be required to determine the efficiency limits under poor conditions of illumination.
The instruments can detect any object with contours or contrasts,
but they are unable to respond to an even surface without contrasts. In
this respect the ambient light principle is inferior to the infrared obstacle
detector developed by Benham and Benjamin (1).

The  advantage  of  the  ambient  light  detector  compared  to  the  infrared 
detector  is  that  accurate  distance  ranging  can  be  obtained.  It  is  very  likely, 
however,  that  the  distance  ranging  properties  of  the  infrared  detector  can  be 
considerably  improved.  Some  users  might  find  it  of  advantage  that  the 
ambient  light  obstacle  detector  automatically  scans  an  area,  whereas  the 
infrared  detector  employs  only  a  narrow  beam  of  light. 

The output system with vibrating plates is inferior to those using protruding
pegs. It is difficult to discriminate with the vibratory sense between
the  adjacent  plates,  whereas  no  such  difficulty  arises  with  pegs.  Furthermore, 
the  bistable  mechanical  flip-flop  consumes  less  power  than  a  vibratory 
system. 



From  the  experience  gained  during  this  investigation,  it  is  estimated 
that a miniaturized ambient light obstacle detector could be built with a
weight  of  about  2  pounds,  a  volume  of  less  than  0.2  cubic  feet,  and  a  power 
consumption  of  less  than  1  watt.  These  figures  are  of  the  same  order  of 
magnitude  as  those  for  the  Benham-Benjamin  infrared  detector.  Thus  it 
seems that the size and weight of an ambient light detector employing currently
available components could not be made considerably smaller or
lighter  than  an  infrared  detector  with  comparable  properties. 

REFERENCES 

1. Benham, T. A., and J. M. Benjamin. Electronic Obstacle and Curb Detectors for
the Blind. Washington, D. C.: Veterans Administration, 1960. (Summary Report
on Contract No. V1001M-1900.)

2. Kallmann, H. E., “Optar, a Method of Optical Automatic Ranging as Applied to
a Guidance Device for the Blind,” Proc. I.R.E., Vol. 42 (1954), pp. 1438-1446.


PASSIVE  SYSTEMS:  A  MAGNETIC  COMPASS 
AND  STRAIGHT  COURSE  INDICATOR 
FOR  THE  BLIND* 

BERTIL  JACOBSON 

Karolinska  Institutet,  Stockholm,  Sweden 


A sighted person is considerably aided by a compass when moving in surroundings
where he is unable to recognize landmarks. A pilot flying in bad
weather is helpless without a compass or a similar device if he is unable
to see the ground. The blind person is in a similar situation. He may be
familiar with his surroundings, but he occasionally loses his way if disturbed
or if objects known to him earlier have been changed or removed.
Moreover,  it  is  a  well-known  fact  that  blind  persons  have  great  difficulty 
in  orienting  themselves  when  they  have  to  pass  an  open  place  or  a  field.  To 
explore  whether  a  blind  person  could  be  aided  by  a  compass  under  such 
conditions,  a  laboratory  model  of  a  magnetic  compass  has  been  built  and 
tested. 


DESIGN 

The  compass  consists  of  a  box  with  a  magnetic  bar  operating  as  a  compass 
needle  (Figure  1).  The  bar  is  free  to  rotate  on  a  pivot  between  two  jewelled 
bearings.  The  orientation  of  the  bar  in  relation  to  the  box  is  detected  by  a 
simple optical system. A narrow beam of light is focused on a mirror attached
to the pivot. The mirror can reflect light only when the box is
oriented on the desired course; the reflected light then illuminates a photoconductive
cadmium sulphide cell. The photocell current is amplified and
made  to  operate  a  relay-like  mechanism  which  causes  a  peg  to  protrude 
from  the  top  of  the  box.  To  dial  a  certain  course  the  cylindrical  compass 
compartment  can  be  rotated  in  relation  to  the  rest  of  the  apparatus.  The 
compass  is  switched  “on”  by  pressing  a  recessed  membrane  which  operates 
a  microswitch. 

* This investigation was supported by a research grant from the Swedish Technical
Research Council.

The compass consumes 0.17 watt when the tactile output relay is energized,
and 0.05 watt when the compass is pointing on a wrong course. The
standard Leclanche cell lasts for about 35 hours of continuous operation.
Since the compass is ordinarily used intermittently, a battery may be expected
to last several weeks or even months.
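
The step from 35 hours of continuous operation to "weeks or even months" can be checked with simple arithmetic. The sketch below uses only the three figures given in the text; the daily usage times and the fraction of time spent on course are assumed, and self-discharge of the cell is ignored.

```python
ON_COURSE_W = 0.17        # from the text: tactile output relay energized
OFF_COURSE_W = 0.05       # from the text: compass pointing on a wrong course
CONTINUOUS_LIFE_H = 35.0  # from the text: cell life in continuous operation


def average_power_w(on_course_fraction):
    """Mean power for a given fraction of switched-on time spent on course."""
    return (ON_COURSE_W * on_course_fraction
            + OFF_COURSE_W * (1.0 - on_course_fraction))


def battery_life_days(minutes_of_use_per_day):
    """Days of service if the compass is switched on this long each day."""
    return CONTINUOUS_LIFE_H / (minutes_of_use_per_day / 60.0)


for minutes in (10, 20, 60):
    print(f"{minutes:3d} min per day -> about {battery_life_days(minutes):.0f} days")
print(f"average power, on course half the time: {average_power_w(0.5):.2f} W")
```

At an assumed twenty minutes of use a day the cell would last on the order of a hundred days, in line with the author's estimate.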


Figure  1  Magnetic  Compass  for  the  Blind 


The  circuit  diagram  is  shown  in  Figure  2.  On  closing  the  microswitch, 
the  current  flows  to  the  lamp  and  the  electronic  circuit.  When  the  resistance 
of  the  cadmium  sulphide  cell  is  reduced  upon  illumination,  the  change 
in  current  is  amplified  by  the  first  transistor,  which  is  followed  by  a  Schmitt 
trigger circuit directly coupled to the output transducer. To permit operation
from 45 degrees to below 0 degrees C, the cadmium sulphide cell is
connected  in  parallel  with  a  thermistor. 
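
A Schmitt trigger switches on at one level and off at a lower one, so a signal hovering near a single threshold does not make the output chatter. The following is a behavioral sketch only; the two threshold values and the sample sequence are hypothetical and are not taken from the circuit diagram.

```python
def schmitt_trigger(samples, upper, lower, state=False):
    """Output state after each sample, with hysteresis between two thresholds."""
    outputs = []
    for value in samples:
        if value >= upper:
            state = True    # enough light on the cell: raise the peg
        elif value <= lower:
            state = False   # light clearly gone: release the peg
        # Between the thresholds the previous state is held.
        outputs.append(state)
    return outputs


# Amplified photocell signal as the box swings through the dialed course.
signal = [0.10, 0.45, 0.70, 0.55, 0.45, 0.20]
print(schmitt_trigger(signal, upper=0.6, lower=0.3))
```

Note that the two samples of 0.45 give different outputs depending on whether the signal is rising or falling, which is the hysteresis that keeps the peg from fluttering at the edge of the beam.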

APPLICATIONS 

The  compass  is  easy  to  use.  After  the  desired  course  has  been  dialed  the 
correct  course  is  found  by  pointing  the  box  in  different  directions  by  trial 
and  error  (Figure  3).  It  is  easier  to  use  the  compass  and  to  follow  a  given 
direction  if  it  is  swung  from  side  to  side.  If  desired  it  can  be  operated  when 
concealed  in  the  pocket,  provided,  of  course,  that  no  magnetic  objects  are 
carried  in  or  near  the  pocket. 



Figure 2 Cross-Section of the Magnetic Compass for the Blind (labeled parts: microswitch, tactile output peg, armature, photoconductive cell, lamp, lens, magnet, battery)

Without training a person can follow a course that is correct within
plus or minus 7 degrees. The attainable accuracy after careful training has
not yet been determined. Each person seems to have a tendency to move
with a certain deviation from the correct course. Most people tend to drift
to the right when carrying the compass in the left hand. For some the drift
is large, while others seem to be able to follow the correct course without
much practice.
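
To give a sense of what a heading error of 7 degrees means in practice, the sideways displacement can be computed for a given distance advanced along the intended course. The sketch assumes a constant heading error over the whole distance, which is the worst case; the distances are illustrative.

```python
import math


def lateral_error_m(distance_advanced_m, heading_error_deg):
    """Sideways displacement after advancing along the intended course."""
    return distance_advanced_m * math.tan(math.radians(heading_error_deg))


for distance in (10.0, 50.0, 100.0):
    print(f"{distance:5.1f} m advanced at 7 degrees off course: "
          f"{lateral_error_m(distance, 7.0):.1f} m to one side")
```

A steady 7 degree error thus amounts to roughly an eighth of the distance covered, which matters little across a street but a good deal across an open field.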


DISCUSSION 

Besides the magnetic compass, the gyro and radio compasses are widely
used  in  navigation  for  finding  a  desired  direction  or  maintaining  a  straight 
course.  The  gyro  compass  is  more  accurate  and  dependable  than  the  other 
two  principles,  but  its  complexity  seems  to  preclude  its  use  by  the  blind. 
The  radio  compass  suffers  from  the  same  disadvantage  as  the  magnetic 


196  Man-Machine  Systems 

compass  in  that  the  electromagnetic  field  from  the  transmitter  is  distorted 
by  conducting  objects.  The  radio  compass  is  also  limited  by  the  availability 
of  a  suitable  transmitter.  For  these  reasons  the  magnetic  principle  seems  to 
be  the  most  useful  method. 

One difficulty in designing a magnetic compass is that the most suitable
output systems are all based on electromagnetic action. The system used
here  is  similar  to  an  electromagnetic  relay.  In  another  model  of  a  compass 
tested,  vibrating  plates  were  employed  as  output  transducers.  To  get  reason¬ 
able  power  efficiency  such  vibrating  systems  usually  involve  permanent 
magnets.  It  is  difficult  to  incorporate  such  output  transducers  in  a  magnetic 
compass  since  the  stray  magnetic  field  from  the  transducer  must  not  affect 
the  compass  bar.  For  this  reason  it  is  not  possible  to  miniaturize  a  magnetic 
compass  with  magnetic  output  systems  below  a  certain  limit.  The  present 
system is 5½ inches high, and could hardly be diminished by more than 1
inch. The other dimensions of the compass (3 by 1½ inches) could be
much  reduced  since  no  attempt  has  been  made  to  miniaturize  the  present 
design. 

An  ordinary  compass  must  be  accurately  leveled  to  prevent  the  needle 
from  coming  into  contact  with  the  floor  of  the  box.  This  is  a  great  dis¬ 
advantage  to  a  blind  person.  The  compass  described  here  may  be  operated 
even  when  slightly  tilted,  since  the  bar  is  pivoted  on  two  jewelled  bearings. 

The  major  disadvantage  of  the  magnetic  compass  principle  is  that  ob¬ 
jects  containing  iron  distort  the  magnetic  field  of  the  earth.  Thus  a  car 
parked at the curb causes an error, as do, to a much greater extent, large
buildings  with  a  steel  frame  or  a  reinforcing  steel  mesh.  In  general  a  dis¬ 
tortion  of  practical  import  is  obtained  at  a  distance  approximately  equal 
to  the  size  of  the  iron  object. 



We  consider  that  the  compass  can  be  a  useful  aid  for  a  blind  person 
moving  across  open  spaces.  No  other  technical  method  hitherto  developed 
gives  the  same  aid.  The  compass  might  also  prove  of  value  in  urban  areas. 
It  is  then  necessary  to  train  the  user  in  the  detection  of  false  readings  due 
to  iron  objects.  When  moving  along  a  straight  thoroughfare  it  is  possible 
to  detect  the  presence  of  such  objects  by  observing  the  deflection  of  the 
magnet,  and  the  pedestrian  can  then  compensate  for  them. 

It  might  be  of  advantage  to  attach  the  compass  to  a  cane.  It  would  then 
be  possible,  during  the  cane  movements  from  side  to  side,  to  have  a  simul¬ 
taneous  indication  of  the  straight  course  by  the  compass. 


PASSIVE SYSTEMS: STRAIGHT LINE
TRAVEL AID FOR THE BLIND*

JAMES  C.  SWAIL 

National  Research  Council  of  Canada,  Ottawa,  Canada 


In  the  following  paper  a  simple  orientation  device  is  described  which  may 
be  used  by  a  blind  person  as  a  straight  line  travel  aid. 

Since  World  War  II  a  number  of  attempts  have  been  made  to  develop  an 
electronic  travel  aid  for  the  blind.  For  the  most  part  these  have  employed  re¬ 
flected  supersonic  or  optical  energy  to  indicate  the  presence  of  obstacles  in 
the  path  of  the  user.  If  successful  these  devices  could  provide  the  user  with 
information  as  to  his  immediate  surroundings.  None  of  them,  however,  would 
be  capable  of  giving  him  information  about  his  direction  of  travel.  These  may 
then  be  likened  to  a  ship’s  radar  which  does  not  obviate  the  need  for  a 
ship’s  compass. 

Blind  persons  often  find  it  difficult  to  follow  a  straight  line  and  to 
maintain  their  sense  of  direction  while  crossing  an  open  area.  For  example, 
difficulty  may  be  encountered  at  the  wide  entrance  to  a  parking  lot  or  service 
station,  a  wide  intersection,  an  open  field,  or  a  very  wide  sidewalk.  The 
device  can  also  be  very  useful  in  identifying  which  of  several  streets  to  take 
at  an  intersection  where  a  number  of  streets  converge.  Confusion  under 
these  circumstances  is  particularly  likely  to  occur  under  conditions  of  loud 
noise  or  after  a  heavy  snow. 

Recently  it  came  to  the  author’s  attention  that  a  number  of  blind 
people  have  been  experimenting  with  the  use  of  small  portable  transistor 
receivers  as  straight  line  travel  aids.  It  is  possible  to  use  these  receivers 
because  of  the  extreme  directivity  of  the  ferrite  rod  antennas  in  some  of 
these  sets. 

One  of  these  sets  was  given  extensive  trials  for  this  purpose.  However, 
it became apparent that this type of receiver, although giving some information,
leaves much to be desired. The automatic gain control action in
these receivers masks the null point except under ideal conditions. To
perform satisfactorily there must be low ambient acoustic and electrical
noise, a rather critical value of signal level from the broadcasting station,
and appreciable time in which to perform the operation. The varying
modulation content can be very confusing to the user; that is, pauses
in the transmitted speech or music may give a false impression of a directional
null. Another problem arises if the user becomes momentarily interested
in the program, thus distracting his attention. This could be very
dangerous in traffic conditions.

* The author wishes to thank Mr. E. J. Doyle for his technical assistance and
many helpful suggestions during the course of this work.

All  of  this  adds  up  to  the  fact  that  under  some  rural  conditions  a  trans¬ 
istor  receiver  can  be  of  some  use  as  a  travel  aid,  but  in  urban  situations  an 
unmodified  set  may  actually  be  more  dangerous  than  helpful.  At  the  out¬ 
set  it  was  decided  that  if  an  orientation  device  of  this  nature  were  to  be 
acceptable,  it  should  be  capable  of  giving  the  required  information  re¬ 
liably;  it  should  be  lightweight,  inconspicuous,  inexpensive,  and  small 
enough  to  fit  conveniently  in  the  owner’s  pocket  or  purse  when  not  required. 
It  was  decided,  therefore,  to  modify  a  standard,  low-cost  transistor  re¬ 
ceiver;  this  approach  would  have  the  advantage  of  providing  the  blind  user 
with  a  radio  when  the  device  was  not  being  used  in  its  orientation  function. 

From  work  done  in  the  past  on  the  design  of  test  equipment  for  the 
blind,  it  is  clear  that  under  conditions  of  noise  it  is  much  simpler  to  detect 
a  change  in  pitch  of  an  audible  tone  than  a  change  in  amplitude.  Also,  be¬ 
cause  of  the  varying  character  of  the  modulation  in  the  broadcast  signal, 
the  carrier  should  be  used  as  the  reference. 

For  these  reasons  a  simple  circuit  was  devised  which,  when  connected 
to the receiver, produces in the output a series of sharp pulses. In the absence
of a received carrier these pulses are at a very low repetition rate,
but as the strength of the received carrier increases the pulse repetition rate
increases, and a high-pitched squeal is produced in the presence of a strong signal.
Once  a  strong  signal  is  tuned  in  and  the  pulse  generator  put  into  operation  the 
set  is  then  rotated  for  minimum  pitch.  This  is  the  previously  mentioned 
null,  which  is  now  much  easier  to  find.  With  a  reasonably  strong  signal, 
and  using  a  receiver  whose  design  permits  of  a  good  null,  a  discrimination 
of  about  five  degrees  or  less  is  possible  with  this  system  (see  Figures  1 
and  2). 
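
The null-seeking procedure just described can be illustrated with a small simulation. This is only a sketch under stated assumptions: the ferrite rod is given an idealized |sin| pickup pattern (no signal when the rod's axis points at the station), and the linear mapping from carrier strength to pulse rate, with the constants MIN_RATE_HZ and MAX_RATE_HZ, is hypothetical and not taken from the circuit.

```python
import math

MIN_RATE_HZ = 2.0     # assumed pulse rate with no carrier (slow clicks)
MAX_RATE_HZ = 2000.0  # assumed pulse rate at full carrier (high-pitched squeal)


def carrier_strength(set_heading_deg, station_bearing_deg):
    """Idealized ferrite-rod pickup: zero when the rod axis points at the station."""
    off_axis = math.radians(set_heading_deg - station_bearing_deg)
    return abs(math.sin(off_axis))


def pulse_rate(strength):
    """Assumed linear mapping from carrier strength (0..1) to pulse rate in Hz."""
    return MIN_RATE_HZ + (MAX_RATE_HZ - MIN_RATE_HZ) * strength


def find_null(station_bearing_deg, step_deg=1):
    """Rotate the set through a half turn and return the heading of lowest pitch."""
    headings = range(0, 180, step_deg)
    return min(
        headings,
        key=lambda h: pulse_rate(carrier_strength(h, station_bearing_deg)),
    )
```

In this idealized pattern the null repeats every 180 degrees, so the sketch sweeps only a half turn; a station at 70 degrees and one at 250 degrees give the same null heading.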

It  is  very  important  that  care  be  taken  in  the  selection  of  the  receiver 
to  be  modified.  Because  of  the  varying  designs  of  ferrite  antennas  en¬ 
countered  in  different  makes  and  in  differing  circuit  board  layouts,  some 
receivers  work  much  more  satisfactorily  than  others  and  give  much 
sharper  nulls. 


Figure 1  Modified AM Transistor Radio for a Straight Line Travel Aid
[Circuit diagram not reproduced. Supply: -9 V. Notes: T1 - transformer, Hammond 4M42M; primary impedance 1200 ohms, C.T.; secondary not used.]


In  the  first  sets  modified  at  the  National  Research  Council,  the  extra 
circuitry  was  housed  in  an  enlarged  earphone  pouch  attached  to  the  carry¬ 
ing  strap.  A  switch  protruded  from  one  end  of  this  pouch  and  was  connected 
by  a  flexible  cable  to  the  receiver.  In  later  models,  however,  a  smaller  bat¬ 
tery  was  substituted  in  the  receiver,  and  the  extra  electronics  installed  in 
the  remaining  space.  Although  this  means  a  shorter  battery  life,  it  produces 
a  much  more  compact  and  rugged  set.  The  first  sets  proved  too  fragile  in 
field  use.  The  cables  were  often  accidentally  pulled  from  their  moorings 
(see  Figure  3). 

In  order  to  establish  the  usefulness  of  the  device  a  number  were  dis¬ 
tributed  to  blind  people  in  Ottawa  and  Montreal.  The  results  of  these  field 
tests  were  most  encouraging.  Although  the  travel  aid  does  not  enable  a 
blind  traveler  to  go  anywhere  or  do  anything  he  would  not  do  without  the 
device,  having  it  does  increase  his  confidence  and  speed.  This  is  particularly 
true  under  snow  conditions. 




Figure  3  Appearance  of  the  Completed  Straight  Line  Travel  Aid 


Two  difficulties  have  shown  up,  both  of  which  fortunately  have  simple 
solutions.  First,  the  sound  from  the  loudspeaker  made  some  of  the  users 
feel  conspicuous  while  walking  along  quiet  streets,  whereas  it  was  com¬ 
pletely  inaudible  under  very  noisy  conditions.  Second,  the  device  requires  a 
hand,  and  when  a  white  cane  is  used  at  the  same  time,  there  is  no  free 
hand  for  carrying  parcels,  etc. 

The  first  problem  is  overcome  simply  by  using  the  earphone  supplied 
with  such  sets.  If  the  earphone  is  equipped  with  a  suitable  clip,  it  may  be 
attached to the shirt collar, which is close enough to the ear to be audible
under any conditions so far encountered, and yet not sufficiently loud to
attract  attention.  The  earphones  should  not  be  worn  in  the  usual  way  since 
this  would  interfere  with  the  normal  function  of  the  ear;  a  blind  person 
obtains  most  of  his  information  about  his  surroundings  by  sound  cues. 

The  second  problem  was  solved  through  the  use  of  a  simple  clip,  which 
permitted  the  set  to  be  attached  to  belt  or  briefcase,  etc.,  and  yet  left  it 
free  to  be  rotated  to  the  desired  direction.  This  also  solved  another  problem 
encountered  by  some,  namely  the  difficulty  in  keeping  the  hand  in  a  stable 
position  relative  to  the  desired  direction  of  travel.  It  was  suggested  that  a 
set  kept  in  the  pocket  and  equipped  with  a  rotating  antenna  would  solve 
this.  Unfortunately,  this  seems  too  costly  to  manufacture. 

At  the  suggestion  of  one  of  the  users,  a  tactile  output  was  investigated. 
It  was  discovered,  however,  that  it  would  consume  far  too  much  power,  and 
that the sense of touch is not sufficiently acute to distinguish any simple form
of  output.  It  was  felt  that  if  the  device  became  complex  much  of  its  potential 
would  be  lost  no  matter  how  interesting  a  gadget  it  might  be  from  a  technical 
point  of  view. 

Under  some  urban  conditions  where  large  steel  structures  are  en¬ 
countered  the  device  must  be  used  with  caution,  as  these  buildings  can 
distort  the  received  pattern  drastically.  In  one  particular  instance  the  ap¬ 
parent  direction  of  reception  shifted  by  90  degrees  within  a  space  of  25  feet. 
This  is  obviously  a  severe  hazard.  It  is  recommended,  therefore,  that  such 
devices be used on urban routes only after the route has been carefully checked
over  once.  In  suburban  or  rural  situations  this  would  probably  not  be 
necessary.  This  limitation  is  probably  not  too  serious  anyway,  since  most 
blind  persons  travel  the  same  route  daily  and,  as  a  matter  of  fact,  these  dis¬ 
tortions  can  serve  a  very  useful  function  as  trailmarkers. 

In  conclusion,  it  is  felt  that  this  device,  if  properly  used,  can  increase 
the  self-confidence  of  blind  people  who  travel  alone  and  assist  them  to  travel 
more  quickly.  As  it  is  small,  it  may  be  conveniently  carried  on  the  person 
and  is  available  when  required. 

It  is  interesting  to  note  that  a  number  of  sighted  people  have  expressed 
a  desire  for  this  instrument,  to  be  used  as  a  combination  radio  and  direction 
finder  while  hunting  or  fishing. 

The  experience  gained  in  these  tests  has  shown  conclusively  that  the 
device  should  not  be  used  without  adequate  instruction  in  the  field.  We 
found  that  where  sets  had  been  handed  out  with  only  verbal  or  written 
instructions,  problems  arose  which  did  not  occur  where  a  personal  demon¬ 
stration  had  been  given. 

Although not firmly established, it would appear that the set could be
produced for less than $40, about twice the price of the unmodified radio.
The  added  electronics  in  no  way  hampers  the  set’s  normal  function. 

CIRCUIT  DESCRIPTION 

The  accompanying  circuit  diagrams  show  two  alternative  circuits  added 
to  a  typical  six-transistor  receiver.  These  circuits  are  basically  similar  and 
the  choice  of  which  is  appropriate  depends  on  which  is  the  more  economical 
to  manufacture  and  which  will  fit  into  the  available  space  more  readily. 

In both, a germanium diode has its cathode connected to the secondary
of  the  last  i.f.  transformer;  thus  a  negative  dc  voltage  will  appear  between 
its  anode  and  ground.  This  voltage  is  proportional  to  the  strength  of  the 
incoming  signal. 

This  is  applied  to  the  base  of  a  p-n-p  transistor.  This  transistor  in  turn 
serves  as  one  of  the  time  constant  elements  of  the  blocking  oscillator  or 
multivibrator.  Thus  as  the  level  of  this  voltage  increases,  so  does  the  current 
flowing  in  the  base-emitter  junction.  This  lowers  its  collector  impedance, 
shortens  the  time  constant,  and  raises  the  frequency  of  oscillation. 
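
The dependence of pitch on signal strength follows from the time constant. As a rough sketch, suppose the repetition period is simply R times C, where R is the transistor's effective collector impedance. The inverse law used for R and every component value below are illustrative assumptions, not values from the circuit diagrams.

```python
def collector_impedance(base_current_ua, r_max_ohms=1_000_000.0, k_ohm_ua=50_000.0):
    """Assumed model: impedance falls as base current rises, capped at r_max_ohms."""
    if base_current_ua <= 0:
        return r_max_ohms
    return min(r_max_ohms, k_ohm_ua / base_current_ua)


def repetition_rate_hz(base_current_ua, capacitance_f=0.1e-6):
    """Pulse rate taken as 1 / (R * C); a real oscillator adds a constant factor."""
    return 1.0 / (collector_impedance(base_current_ua) * capacitance_f)
```

With these assumed values the rate rises from about 10 pulses per second with no base current to about 2000 at 10 microamperes, which is the qualitative behavior the text describes.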

The  blocking  oscillator  or  multivibrator  circuits  are  straightforward  and 
need  no  comment,  except  to  say  that  their  output  is  coupled  back  to  the 
top  of  the  gain  control  through  a  capacitor  whose  value  is  chosen  so  that  it 
will  serve  to  couple  sufficient  energy  without  bypassing  any  of  the  audio 
signal  when  the  set  is  being  used  normally. 


SECTION  II 


READING  MACHINES 

CHAIRMAN:  HOWARD  FREIBERGER 
Veterans  Administration,  New  York,  New  York 


SOME  DESIGN  CRITERIA 
FOR  A  BLIND  READING  AID* 

MAXWELL  B.  CLOWES 

National  Physical  Laboratory,  Middlesex,  England 


INTRODUCTION 

The  current  interest  in  devices  to  enable  the  blind  to  read  stems  in  part  at 
least  from  recent  advances  in  reading  machines  for  commercial  application. 
The same stimulus also provides a choice not available when the last important
efforts were made (5, 9), when recognition of printed letters was
essentially the task of the human subject. In consequence, research on reading
methods for the blind is now divided into two main schools:

(1) Those who consider that mechanical letter recognition is essential.

(2) Those who continue to believe that such a step can and should
be avoided through the use of human skill in the recognition of
patterns.

The difference is essentially that of providing a reading machine or a
reading aid. In practice the distinction is not clear cut, since it is conceded
that the machine would not be fully automatic (17), and the reading aid
will almost certainly need to be rather more sophisticated than the existing
‘simplex’ systems. It is the purpose of this paper to examine some of the
problems of the reading machine, and to advance some suggestions as to the
form that a successful reading aid might take.

* The author wishes to acknowledge many useful discussions with Dr. A. M.
Uttley and Mr. P. W. Nye. This paper has been written as part of the research
program sponsored by St. Dunstan’s at the Autonomics Division of the National
Physical Laboratory, and is published by permission of the Director of the Laboratory.

READING  AND  DATA  PROCESSING 

The  psychological  processes  which  intervene  between  the  patterns  appearing 
on  the  printed  page  and  the  ideas  and  images  conveyed  to  the  sighted 
reader  (or  the  corresponding  auditory  process)  are  not  yet  understood. 
Miller  (16)  has  indicated  some  of  the  levels  in  the  evidently  hierarchical 
organization  of  language.  The  first  three  levels  he  cites  appear  in  Figure 
1  (Levels  3,  4,  and  5).  They  are  preceded  by  a  level  representing  the  physi¬ 
cal  object  as  seen  by  the  scanner  (Level  1),  and  a  so-called  ‘feature’  stage 
(Level  2).  Miller  did  not  provide  any  precise  routines  for  processing  data 
at  one  level  so  as  to  generate  information  for  the  next.  The  character  recog¬ 
nition  literature,  however,  abounds  with  schemes. 

The  two-dimensional  pattern,  while  usually  scanned  serially,  is  almost 
always  operated  upon  as  if  available  simultaneously.  The  first  ‘chunking’ 
operation  (to  extract  features  of  the  pattern  which  are  common  to  a  num¬ 
ber  of  patterns  in  the  set  to  be  recognized)  has  been  described  in  a  number 
of  simulation  programs  (6,  10,  13).  In  some  cases,  however,  especially 
where  only  a  single  type  style  is  to  be  read,  this  step  may  be  omitted  (15, 
18).  In  either  case  it  is  usually  permissible  to  regard  the  function  of  the 
A-units  (Association  units)  as  that  of  providing  comparisons  between  the 
letter  and  two-dimensional  stored  templates  which  resemble  parts  of  letters 
or  whole  letters.  Among  the  exceptions  to  this  are  the  processes  involving 
the  use  of  moments  to  describe  a  character  (1,  14).  Here  it  is  not  generally 
possible  to  identify  a  particular  moment  or  set  of  moments  with  a  specified 
feature. 

The  concatenation  of  letters  into  words  has  received  considerable  atten¬ 
tion  for  the  purposes  of  machine  translation  and  information  retrieval,  but 
always  through  the  application  of  digital  computers.  A  notable  exception  is 
that  of  the  address  reader  designed  by  Farrington  Manufacturing  Co.  for 
the  U.  S.  Post  Office.  For  a  vocabulary  consisting  of  the  names  of  states  and 
principal  cities,  this  device  achieves  recognition  by  the  relatively  simple 
expedient  of  detecting  ascenders,  descenders,  and  x-height  letters  with  three 
horizontals (a, s, e, and z).
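
Recognition of a word from coarse letter shape alone can be sketched as follows. The text states only which features are detected; the letter sets, the class codes "A", "D", "H", and "x", and the five-word vocabulary are hypothetical choices made for illustration, and the sketch is not a description of the Farrington device.

```python
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
THREE_HORIZONTALS = set("asez")  # x-height letters named in the text


def shape_code(word):
    """Reduce a lowercase word to a string of coarse shape classes."""
    classes = []
    for letter in word.lower():
        if letter in ASCENDERS:
            classes.append("A")
        elif letter in DESCENDERS:
            classes.append("D")
        elif letter in THREE_HORIZONTALS:
            classes.append("H")
        else:
            classes.append("x")  # any other x-height letter
    return "".join(classes)


def build_index(vocabulary):
    """Map each shape code to the vocabulary words that share it."""
    index = {}
    for word in vocabulary:
        index.setdefault(shape_code(word), []).append(word)
    return index


VOCABULARY = ["ohio", "utah", "iowa", "texas", "maine"]  # hypothetical sample
INDEX = build_index(VOCABULARY)
```

In this small sample every word has its own shape code, so a lookup in INDEX identifies the word without recognizing any individual letter. A larger vocabulary would produce collisions that call for further features.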

The intermediate stage of syllable formation is usually omitted in character
recognition (though not in speech recognition), although a comparable
process is involved in the use of digram and trigram statistics to correct
single letter errors (2).

The immediate purpose of this paper is not to classify character recognition
studies, but rather to illustrate the reading aid/reading machine
choice. The problem for us is to decide how many operations should be
provided by mechanical means, and how many should employ the blind
users’ faculties. Essentially we have to decide where the man-machine interface
should be placed. In the existing ‘simplex’ systems, it lies immediately
after the scanner, while current research at the Haskins Laboratories (8)
places it at the word level. The Haskins program is not concerned, however,
with the development of a “cheap, transportable” device, for which
it is clear the interface cannot lie higher than the letter level. Mauch (17)
is developing a system based upon an interface at this level; what are the
difficulties?

Figure 1  The Levels of Language. After George A. Miller (16).

READING  MACHINES  FOR  THE  BLIND 

An  important  distinction  between  the  reading  requirements  of  commerce 
and  research  and  those  of  the  blind  lies  in  the  variety  of  data  to  which  the 
blind  desire  access.  This  includes  variations  in  size,  style,  format,  and 
quality,  most  of  which  are  usually  eliminated  or  at  least  under  control  in 
commercial  applications.  Mauch  (17)  has  estimated  that  six  styles  cover 
75  to  80  percent  of  letter  press  and  imprinted  material,  and  that  size  vari¬ 
ations  will  be  within  the  range  8  to  24  point  (i.e.,  a  4  to  1  ratio).  While 
computer  programs  have  been  demonstrated  for  isolated  characters  having 
comparable  if  not  greater  variability  (10,  13)  a  machine  having  this  degree 
of  versatility  has  yet  to  be  demonstrated. 

The approach to multifont reading need not be so drastic, however, as
is  implied  by  the  studies  cited  above  (10,  13).  The  latter  are  relevant  to 
situations  in  which  the  print  parameters  are  varying  continuously,  and 
where  it  is  necessary  to  obtain  an  ‘invariant’  recognition  technique.  The 
problem  is  much  simpler  if  the  style  and  size  changes  can  be  predicted  in 
advance,  whereupon  the  necessary  changes  of  logic,  templates,  magnifica¬ 
tions,  etc.,  can  be  made  prior  to  attempts  at  recognition.  A  partial  multifont 
reader  may  be  a  practicable  proposition,  therefore,  for  those  circumstances 
in  which  a  single  style  is  to  be  read  for  long  periods. 

So  far  we  have  discussed  the  recognition  problem  in  general  terms.  We 
may  enquire  more  specifically  as  to  the  form  that  such  a  system  might  take, 
given  the  additional  constraints  that  the  machine  be  “cheap  and  trans¬ 
portable.”  Among  the  technically  less  involved  approaches  are  the  template 
matching  and  peephole  methods  (19),  which  both  rely  upon  evaluating 
‘best  fit’  between  the  unknown  letter  and  a  set  of  shapes  in  a  stored  vocab¬ 
ulary.  It  is  well  known  that  such  systems  require  accurate  registration  of 
the character with respect to stored templates. Registration is usually effected
by ‘sensing’ the horizontal and vertical extremities of the unknown
pattern. Failure to achieve correct positioning can result in an incorrect
identification, as illustrated in Figure 2, where the unknown ‘M’ matches the
stored ‘V’ better than it matches the stored ‘M.’ This type of error is particularly
troublesome with print of poorer quality in which the extremities
of a character are often degraded. It may be overcome by the use of more
sophisticated criteria in defining ‘extremity,’ accompanied by duplication
of the templates in a number of positions. Such techniques have been found
necessary in the design of commercial devices reading numerical data from
imprinting machines. Here, however, the tolerated error and rejection rates
are very low; in applications for the blind user these factors may not be so
important.

Figure 2  An Incorrect Identification Due to Failure to Achieve Correct Positioning
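
The effect of misregistration, and the remedy of duplicating templates in several positions, can be sketched with toy bitmaps. The 5 by 5 glyphs, the 7 by 7 canvas, and the agreement-count score are all hypothetical; this is an illustration of the general idea, not the method of any device cited here.

```python
def pad(glyph):
    """Centre a 5x5 glyph on a 7x7 canvas with a one-cell blank border."""
    blank = "." * 7
    return [blank] + ["." + row + "." for row in glyph] + [blank]


def shift(bitmap, dx, dy):
    """Move the pattern dx cells right and dy cells down, filling with blanks."""
    height, width = len(bitmap), len(bitmap[0])
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            sx, sy = x - dx, y - dy
            inside = 0 <= sx < width and 0 <= sy < height
            row += bitmap[sy][sx] if inside else "."
        rows.append(row)
    return rows


def score(a, b):
    """Count the cells on which two bitmaps agree."""
    return sum(ca == cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def classify(unknown, templates, offsets):
    """Return the template name with the best score over all tried offsets."""
    best_name, best_score = None, -1
    for name, template in templates.items():
        for dx, dy in offsets:
            s = score(unknown, shift(template, dx, dy))
            if s > best_score:
                best_name, best_score = name, s
    return best_name


TEMPLATES = {
    "M": pad(["X...X", "XX.XX", "X.X.X", "X...X", "X...X"]),
    "V": pad(["X...X", "X...X", ".X.X.", ".X.X.", "..X.."]),
}
OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
UNKNOWN = shift(TEMPLATES["M"], 1, 0)  # an 'M' registered one cell too far right
```

Compared at a single position, the displaced 'M' no longer agrees perfectly with its own template. Trying each template at the nine offsets in OFFSETS restores a perfect match, at the cost of nine times as many comparisons.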

The  problems  of  registration  shade  imperceptibly  into  those  of  segmen¬ 
tation.  The  two-dimensional  matching  process  has  to  distinguish  between  a 
single character and two halves of adjacent characters. The process of
locating the horizontal extremities of a character in order to do this, when
the character is embedded in close proximity with other letters, is acknowledged
as being a fundamental difficulty in print recognition (12).

These  are  some  of  the  difficulties  to  be  overcome  in  a  successful  recog¬ 
nition  machine.  The  particular  use  to  which  the  system  is  to  be  put — 
feeding  data  to  a  blind  person  rather  than  to  a  computer — may  have  some 
influence upon the resolution of these difficulties. The effects of single
letter errors may not be so important, and manual registration techniques
may  be  practicable.  The  use  of  a  sighted  helper  to  provide  information  about 
size  and  style  changes  may  not  be  acceptable  to  the  blind  person.  An  im¬ 
portant  aspect  of  an  effective  personal  reading  device  is  the  facility  it 
affords  for  maintaining  the  privacy  of  the  blind  user.  It  is  undoubtedly 
true  that  a  workable  reading  machine  could  be  devised  for  use  by  the  blind. 
Its  cost,  however,  is  bound  to  place  it  beyond  the  reach  of  many  potential 
users  when  compared  with  the  cost  of  a  paid  companion. 

READING  AIDS  FOR  THE  BLIND 

The  reading  machine  aims  to  supply  the  first  two  stages  of  the  hypothetical 
data  processing  scheme  of  Figure  1  in  terms  of  hardware.  Exponents  of 
this  approach  argue  that  this  is  the  inevitable  conclusion  to  be  drawn  from 
the  many  failures  to  improve  reading  performance  by  modifying  the  mode 
of  display.  The  different  versions  of  the  Optophone,  the  RCA  reader,  the 
Argyle  reader,  and  Mauch’s  work  on  speech-like  displays,  all  seem  to  argue 
for  at  least  one  further  stage  of  data  processing.  We  have  examined  the 
implications  of  an  interface  at  Level  3.  Can  any  improvement  be  effected 
by  less  sophisticated  mechanisms?  In  our  preliminary  studies  of  possible 
improvements  to  reading  aids,  attempts  were  made  to  utilize  the  spatial 
capacities  of  other  sensory  modalities  (11).  This  would  seem  to  be  the 
obvious  way  to  achieve  the  required  ‘invariances’  in  respect  of  position, 
style,  etc. 

In  a  simulation  study  involving  the  tactual  mode,  we  measured  speed 
and  accuracy  with  large  embossed  characters.  Over  a  short  series  of  five 
half-hour  training  and  test  sessions,  we  were  able  to  demonstrate  accuracies 
of  90  percent  or  better  with  a  limited  10-symbol  alphabet.  The  symbols  were 
tracked  singly  past  an  aperture,  at  10  second  intervals,  at  a  rate  of  30  words 
per  minute  (wpm).  In  a  second  experiment  the  material  was  in  the  form  of 
5-letter  nonsense  words;  under  self-paced  conditions  these  were  read  at 
about  3  wpm.  This  would  be  higher  of  course  for  meaningful  material,  per¬ 
haps  6  wpm,  which  would  compare  favorably  with  the  performance  ob¬ 
tained  with  Optophone  systems  when  one  realizes  the  very  limited  amount 
of  training  involved.  It  is  proposed  to  carry  out  more  extensive  tests  on  this 
embossed  material.  The  technical  problems  involved  in  developing  a  two- 
dimensional  display  of  this  kind  are  formidable:  embossing  devices  on  the 
Visagraph  model  are  not  likely  to  prove  acceptable,  and  a  two-dimensional, 
solenoid-operated  pin  matrix  is  equally  difficult.  Systems  to  enable  touch 
reading of holes punched in tape or cards have been considered, and we
have noted with great interest the work of Professor Mann and his colleagues
at the Massachusetts Institute of Technology.

A  second  study,  on  the  exploitation  of  directional  sensitivity  in  hearing, 
has  to  date  proved  inconclusive.  In  essence,  this  attempted  to  display  sounds 
from  different  parts  of  the  letter  at  corresponding  points  in  the  inter-aural 
space.  It  seems  likely  that  considerable  skill  may  be  required  to  learn  to 
perceive  sounds  structured  in  this  way. 

With  the  continued  desire  of  making  the  geometry  of  the  character  more 
readily  perceived  by  the  subject,  we  have  turned  our  attention  to  more  ab¬ 
stract  techniques,  still  with  a  view  to  an  interface  below  Level  3. 

Miller  (16)  has  pointed  out  that  discrete  stimuli  can  convey  large 
quantities  of  information  to  a  human  observer,  but  only  if  the  stimuli  con¬ 
form  to  certain  criteria: 


(1)  They  should  vary,  one  from  another,  in  as  many  attributes  as 
possible. 

(2)  A  single  attribute  should  have  only  a  small  number  of  permitted 
values  (e.g.,  not  more  than  four). 


Two  ‘alphabets’  constructed  in  accordance  with,  and  at  variance  with, 
these  principles  are  illustrated  in  Figures  3  and  4.  A  simple  phrase  is  coded 
in  the  two  ‘alphabets.’  The  attribute  chosen  for  the  unidimensional  code 
(Figure  3)  was  orientation  (6-degree  steps)  as  being  the  only  practicable 
variable  for  illustration.  The  more  complex  code  (Figure  4)  has  symbols 
differing in shape (triangle or circle) and size (large or small). The
‘texture’ of a shape has three additional attributes: heavy or light bars,
oriented vertically or horizontally, appearing on a clear or dotted background.
The difficulties of discrimination with a single-attribute code are
brought out in this illustration.

Figure 3  A Unidimensional Code

Figure 4  A Multidimensional Code
[Coded phrase: FIVE DIMENSIONAL CODE]
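
The contrast between the two kinds of code can be made concrete. The sketch below builds a five-attribute code from the binary attributes named in the text (32 possible symbols, enough for 26 letters) and a single-attribute code in 6-degree orientation steps. The assignment of letters to symbols is hypothetical, since the assignments used in the figures cannot be recovered from the text.

```python
from itertools import product

ATTRIBUTES = [
    ("shape", ("triangle", "circle")),
    ("size", ("large", "small")),
    ("bars", ("heavy", "light")),
    ("orientation", ("vertical", "horizontal")),
    ("background", ("clear", "dotted")),
]
NAMES = [name for name, _ in ATTRIBUTES]

# Every combination of the five two-valued attributes: 2**5 = 32 symbols.
SYMBOLS = [
    dict(zip(NAMES, values))
    for values in product(*(choices for _, choices in ATTRIBUTES))
]

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE = dict(zip(LETTERS, SYMBOLS))  # hypothetical letter assignment

# Single-attribute code: one orientation per letter, in 6-degree steps.
ORIENTATION_CODE = {letter: 6 * i for i, letter in enumerate(LETTERS)}


def attributes_differing(a, b):
    """Number of attributes on which the symbols for two letters differ."""
    return sum(CODE[a][name] != CODE[b][name] for name in NAMES)
```

In the multidimensional code each attribute takes only two values, in keeping with the second criterion, whereas the orientation code asks the observer to distinguish 26 values of one attribute, with neighbors only 6 degrees apart.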

The  application  of  these  observations  to  the  design  of  a  blind  reading 
aid  begins  from  the  premise  that  any  audible  output  should  conform  to 
these  principles.  The  precise  method  by  which  print  is  mapped  in  such 
multidimensional  sounds  may  well  be  the  crucial  factor  in  enabling  the 
subject  to  perceive  the  geometry  of  the  characters.  It  would  seem  to  imply 
an  identity  relation  between  ‘topological’  attributes  in  the  printed  pattern 
and its auditory transform. That such an identification is important is consistent
with the results from the multidimensional displays used by Beddoes,
Belyea, and Gibson (4). Their results for sighted subjects responding
to a 12-symbol vocabulary do not differ significantly from results obtained
by Ellis (11) for a 10-symbol Optophone vocabulary. These workers have
now gone on to use a spelled speech output (3).

One set of attributes of the printed character that may prove useful in
this connection is the set currently employed to characterize handwritten
numerals in the character recognition studies of the National Physical
Laboratory (NPL). By an autocorrelation procedure (7), short lengths
of the continuous lines of the character are classified as straight/not straight.
This  is  a  local  operation  having  much  in  common  with  the  feature  analyses 
cited earlier (6, 10, 13). The mean orientation, length, and position
within the character (top/bottom) are also determined, and this provides
an  adequate  description  of  many  styles  of  numeral.  The  choice  of  possible 
auditory  attributes  upon  which  to  map  these  geometrical  attributes  is  much 
more difficult, and will require much more research. The utility of this
approach lies in its avoidance of the difficulties of registration and
segmentation, and style, size, and quality changes. These advantages must, however,
be weighed against the learning problems undoubtedly associated with a
Level 2 interface, and the technological problems of deriving this
description of the character.

CONCLUSIONS 

In summary, the joint NPL/St. Dunstan’s program is concentrating on
devices which stop short of mechanical recognition of print. Two types of
device are under consideration, both based upon the presentation of a clearer
display  of  the  geometry  of  the  printed  character.  This  program  is  in  many 
respects  complementary  to  those  being  pursued  in  the  U.S.A.  and  as  such 
widens  the  front  along  which  so  much  praiseworthy  effort  is  now  being 
deployed. 

REFERENCES 

1. Alt, F. L., “Digital Pattern Recognition by Moments,” J. Assn. Computing Machinery, Vol. 9, No. 2 (1962), p. 240.
2. Andrews, M. C., “Multifont Print Recognition,” in Fischer, Pollack, Radack, and Stevens (eds.) Optical Character Recognition. Washington, D. C.: Spartan Books, 1962.
3. Beddoes, M. P. Private communication, 1962.
4. Beddoes, M. P., E. Belyea, and W. C. Gibson, “A Reading Machine for the Blind,” Nature, Vol. 190, No. 4779 (1961), p. 874.
5. Beurle, R. L. Electronic Aids for Blind People. London: St. Dunstan’s, 1952.
6. Bomba, J. S., “Alpha-Numeric Character Recognition,” in Proceedings Eastern Joint Computer Conference, 1959, p. 218.
7. Clowes, M. B., and J. R. Parks, “A New Technique in Automatic Character Recognition,” Computer J., Vol. 4, No. 2 (1961), p. 121.
8. Cooper, F. S. Toward a High Performance Reading Machine for the Blind. New York: Haskins Laboratories, 1961.
9. Cooper, F. S., and P. A. Zahl. Report to the Committee on Sensory Devices. Washington, D. C.: National Academy of Sciences, June 1947.
10. Doyle, W. Recognition of Sloppy, Hand-Printed Characters. Cambridge: Massachusetts Institute of Technology, 1959.
11. Ellis, K., “Some Experiments on Reading Aids for the Blind,” J. Brit. I.R.E., Vol. 25, No. 2 (1963), p. 188.
12. Fitzmaurice, J. A., E. Sabbagh, and W. Elliott in Fischer, Pollack, Radack, and Stevens (eds.) Optical Character Recognition. Washington, D. C.: Spartan Books, 1962.
13. Grimsdale, R. L., F. H. Sumner, C. J. Tunis, and T. Kilburn, “A System for the Automatic Recognition of Patterns,” Proc. I.E.E., Vol. 106B, No. 26 (1959), p. 210.
14. Guiliano, V. E., P. E. Jones, G. E. Kimball, R. F. Meyer, and B. A. Stein, “Automatic Pattern Recognition by a Gestalt Method,” Information and Control, Vol. 4, No. 4 (1961), p. 332.
15. Merry, I. W., and G. O. Norrie, “Character Quality and Scanner Organization,” Computer J., Vol. 4, No. 2 (1961), p. 137.
16. Miller, G. A., “Human Memory and the Storage of Information,” I.R.E. Trans. Information Theory, Vol. 2, No. 3 (1956), p. 129.
17. Nolte, W. G., and H. A. Mauch. Report to the Prosthetic and Sensory Aids Service. New York: Veterans Administration, 1961.
18. Rabinow, J., “Developments in Character Recognition Machines at Rabinow Engineering Company,” in Fischer, Pollack, Radack, and Stevens (eds.) Optical Character Recognition. Washington, D. C.: Spartan Books, 1962.
19. Stevens, M. E. Automatic Character Recognition. Washington, D. C.: National Bureau of Standards, May 1961.


REVIEW  OF 

THE  MAJOR  FUNCTIONAL  CONCEPTS  OF 
READING  MACHINES 

HANS  A.  MAUCH 

Mauch  Laboratories,  Inc.,  Dayton,  Ohio 


Before  entering  into  technical  details,  I  would  rather  start  by  asking  the 
question:  What  is  a  reading  machine  for  the  blind?  The  answer  depends  on 
who  is  talking.  For  someone  interested  in  the  rehabilitation  of  the  blind 
person,  the  machine  is  restoring  a  limited  function  of  the  human  eye,  and 
is  in  that  sense  a  prosthesis.  If  you  ask  the  same  question  of  an  engineer, 
however, he will give you a different answer; for him it is an
information-handling device, which as usual has an input part, an output part, and one
or  more  “black  boxes”  in  between.  The  input  part  receives  information; 
the  output  part  gives  information;  and  they  usually  differ  from  one  another 
in  what  is  defined  as  the  “code.”  So  in  a  sense  these  machines  are  “code 
converters.”  They  take  something  which  the  human  eye  sees  (a  visual 
code, i.e., an alphanumeric graphic system of symbols), and they are
supposed to turn out some kind of code that can still be transmitted and
understood by one of the two major remaining sensory channels, namely the
tactile  or  the  auditory  channel. 

CODE  SELECTION  AND  COST  OF  MACHINE 

The  main  decision  one  has  to  make  in  designing,  or  even  conceiving,  one 
of  these  machines  is:  What  codes  should  be  selected?  The  codes  determine 
the  input  and  output  devices  as  well  as  the  “black  boxes”  in  between,  and 
therefore  the  cost.  And  of  course  when  it  comes  to  cost  matters  we  always 
hear  the  same:  “Such  a  machine  should  be  cheap,”  or  “It  should  be 
portable,”  or  whatever  other  yardstick  one  chooses  for  pointing  out  the 
fact  that  a  blind  person  should  be  able  to  buy  and  use  one.  Since  this 
cost  matter  is  so  important,  it  is  probably  necessary  to  assess,  at  least 
to  some  extent,  what  the  cost  of  such  a  machine  could  possibly  be — or 
rather  what  cost  such  a  machine  should  not  exceed. 



It  is  not  possible  at  the  present  development  stage  of  this  rather  young 
field  to  look  to  existing  reading  machines  for  the  blind  and  derive  a  cost 
estimate  from  them;  there  are  too  few  machines  to  permit  this.  We  must 
then  use  an  indirect  method.  In  doing  this,  we  will  keep  in  mind  that  we 
have  two  different  types  to  consider.  One  is  the  “personal”  or  “portable” 
type  reading  machine  which  also  includes  the  “reading  aid”  which  Dr. 
Clowes  mentions.  The  other  is  the  so-called  “library”  machine  that  is 
usually  introduced  into  a  discussion  whenever  it  appears  that  the  cost  for  a 
really  useful  “personal”  type  machine  might  turn  out  to  be  prohibitive. 

What  should  a  personal  type  of  reading  machine  do?  Let  us  assume 
that  a  blind  person  hires  a  sighted  reader.  The  reader  could  be  quite  young 
and  inexperienced;  he  might  even  be  a  fifth-  or  sixth-grader.  Even  if  he 
happens  to  be  that  famous  Johnny  who  allegedly  can’t  read,  he  will  still 
be  a  mighty  good  “reading  machine.”  He  could  read,  for  example,  80  to  90 
words  a  minute.  He  could  read  all  type  fonts  and  even  handwritten  texts. 
He  could  scan  a  newspaper  quickly  and  find  the  headlines  and  give  you  a 
chance  to  select  what  interests  you.  He  could  explain  the  comics,  including 
the  pictures,  if  you  cared  enough  about  them. 

He  could  even  find  a  certain  stock  in  the  stock  market  column  quickly 
and  let  you  know  how  much  of  it  is  left.  All  of  these  accomplishments 
would  be  quite  tremendous  achievements  if  they  were  done  by  a  reading 
machine.  If  you  hire  this  young  man  for,  say,  two  hours  a  day,  pay  him 
a  dollar  an  hour  (which  he  would  probably  be  quite  happy  to  get),  it  would 
cost  you  60  dollars  a  month.  If  you  rented  a  reading  machine  instead  and 
paid 60 dollars a month, and assuming that you are paying 2½ percent
of  its  total  value  as  a  monthly  rent  (a  figure  that  is  quite  typical  for  rental 
of  appliances),  the  total  value  or  price  of  this  machine  which  would  have 
to  replace  Johnny  could  only  be  2400  dollars — quite  a  sobering  thought. 

Let us go one step further and determine by the same method the
possible price of a library type reading machine. If the library hires a girl with
a secretary’s education, pays her 2 dollars an hour, and has her work 8 hours
a day, we end up with an equivalent monthly rent of 480 dollars, or a figure
of approximately 20,000 dollars as the possible price of a reading machine
which the girl not only replaces, but outperforms to quite a
degree.
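
The two price ceilings follow from the same piece of arithmetic, which can be sketched as below. The function name `price_ceiling` is hypothetical, and the 30-day month is an assumption inferred from the quoted monthly figures (two hours a day at a dollar an hour giving 60 dollars). The rental rate is the 2.5 percent of total value per month used in the text.

```python
def price_ceiling(hourly_wage, hours_per_day, days_per_month=30, rent_percent=2.5):
    """Return (monthly cost of the human reader, equivalent machine price).

    The machine price is the value whose monthly rent, at rent_percent of
    total value, equals what the human reader would cost per month.
    """
    monthly_cost = hourly_wage * hours_per_day * days_per_month
    price = monthly_cost * 100 / rent_percent
    return monthly_cost, price


personal = price_ceiling(hourly_wage=1, hours_per_day=2)
library = price_ceiling(hourly_wage=2, hours_per_day=8)

print(personal)  # (60, 2400.0)
print(library)   # (480, 19200.0)
```

The library figure comes out at 19,200 dollars, which the text rounds to approximately 20,000 dollars.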

By  comparison,  commercial  machines  used  in  industry  cost  much  more 
than  that;  they  come  in  all  sizes,  and  their  prices  range  from  something 
like 50,000 or 75,000 dollars to considerably more. Why then does
industry buy these machines, although they are in some respects not even as
smart as Johnny: e.g., they usually have “cooperative” characters for an
input  code,  characters  which  are  designed  so  the  machine  can  recognize 
them  easily?  The  answer  is  simple:  these  machines  replace  not  just  one 
person,  but  a  whole  group  of  persons,  by  operating  at  extremely  high  speed. 
Commercial  machines  read  so  fast  that  no  human  being  could  follow  their 
output.  They  do  not  read  for  people,  but  for  high  speed  printing  devices,  or 
for  other  machines  that  can  keep  up  with  them.  They  also  can  work  longer 
than  eight  hours  a  day  (if  you  have  an  operator  around  and  a  maintenance 
man handy). So the high price is justified.

The  conclusion  is  that  with  reading  machines  for  the  blind  we  are  caught 
in  quite  a  cost  squeeze:  since  we  cannot  hope  that  any  machine  we  may 
develop  will  perform  as  well  as  Johnny  or  as  the  girl  in  the  library,  we  must 
keep  the  cost  of  the  “personal”  type  machine  considerably  below  2400 
dollars  and  the  cost  of  the  “library”  type  machine  considerably  below 
20,000  dollars.  Of  course,  compared  with  commercial  machines,  we  also 
have  some  alleviating  factors  regarding  costs:  for  instance,  we  not  only  may 
design  our  machines  to  read  at  a  lower  speed,  but  we  must  in  fact  make  them 
read  at  moderate  speed  if  we  are  to  understand  them.  We  can  also  plan,  if 
these  machines  become  a  reality,  to  produce  up  to  a  couple  of  hundred  of 
them  at  a  time.  This  moderate  quantity  production  will  permit  some 
reduction  in  cost  as  compared  with  commercial  machines,  of  which  there  are 
only  a  few  hundred  as  yet  in  this  country,  hardly  any  of  them  of  exactly  the 
same type. The user might also be willing to pay a higher price for the
independence a reading machine will give him: he might prefer the machine
to  a  human  reader  for  all  kinds  of  personal  reasons.  For  one  thing,  he  could 
use  it  any  time  of  the  day  or  night;  there  might  be  some  mail  he  would  want 
to  keep  private;  he  may  want  to  check  his  bank  statements  himself;  and  so 
forth. 

These are alleviating factors, of course, but against them one must keep
in mind another cost advantage of the commercial machines: the latter not
only read from a cooperative input code, but also need produce only a rather
simple output code understandable to other machines (pulses, groups of
pulses, etc.), while a reading machine for the blind must prepare an output
to be understood by the auditory or tactile sense of a human.

INPUT  AND  OUTPUT  CODES 

This  brings  us  back  to  the  question  of  codes.  Before  digressing  on  the 
matter  of  costs  I  mentioned  that  the  kinds  of  codes  determine  the  cost  of 
the reading machine. The input code for a reading machine for the blind is,
in contrast to commercial machines, “noncooperative”; it is not tailor-made
for  the  purpose  and  cannot  be  freely  selected;  it  already  exists  in  books, 
newspapers, magazines, typewritten material, and handwriting. Let us
exclude handwriting from our considerations because reading it by machine
will  remain  impractical  for  some  time  to  come.  As  to  the  other  three  forms, 
namely  book  print,  newspaper  print,  and  typewriter  print,  we  made  a  survey 
and  found  that  in  the  book  trade  the  most  often  used  type  font  is  Caledonia, 
which  accounts  for  30  percent  of  the  type  used  in  books  printed  in  this 
country.  The  next  two  most  common  are  Baskerville  and  Janson.  Taken 
together,  these  three  fonts  cover  more  than  50  percent  of  the  type  used  in 
book  printing  in  the  United  States.  But  book  print  is  perhaps  not  the  most 
important  print  for  a  blind  person,  because  he  can  obtain  quite  a  number 
of  essential  books  in  braille  or  on  tape  or  through  the  Talking  Book  services. 
More important are newsprint and magazines because of their topical value.
In this country, fortunately, it happens that only three type fonts, Corona,
Imperial, and Regal (in that order), account for 75 percent of all the
newspapers and magazines printed. In the case of typewriters, a similar situation
exists.  The  three  styles  of  Elite,  Pica,  and  IBM  Executive  account  for  more 
than  75  percent  of  the  typewritten  matter  produced. 

These  facts  do  ease  the  situation  in  the  case  of  the  reading  machine  for 
the  blind  somewhat,  and  we  can  of  course  take  advantage  of  the  findings 
by  making  the  machine  adaptable.  Since  the  reader  usually  knows  in  what 
font his newspaper is printed (or he can easily find out), he could be
provided with some sort of printed circuit board that he could push into the
reading  machine  when  he  changes  fonts.  He  would  not  need  an  excessive 
number  of  these  boards.  Of  course,  this  would  still  constitute  a  limitation 
on  the  reading  machine,  and  a  severe  one  at  that. 

With  respect  to  output  codes  the  situation  is  different.  We  can  choose 
the  output  code,  although  not  with  complete  freedom.  As  a  first  step  in  code 
selection,  we  can  choose  either  the  auditory  or  a  tactile  channel;  and  as  a 
second  step,  we  can  choose  for  each  channel  one  of  three  different  types  of 
machines.  They  are  usually  referred  to  as  the  “direct  translation”  machine, 
the  “recognition”  machine,  and  the  “integrating”  machine.  These  names 
are  not  very  descriptive,  so  I  shall  explain  them.  In  so  doing,  I  shall  assume 
an  auditory  output  and  deal  with  the  tactile  output  later. 

The  prime  example  of  the  direct  translation  machine  is  the  Optophone. 
The  term  stems  from  the  fact  that  the  machine  translates  the  letter  picture 
into  a  sound  picture  of  the  letter.  In  a  typical  design  there  is  a  vertical  row  of 
photocells over which a projected image of each letter moves while the
device scans the printed line. To each photocell is assigned an audio frequency;
the  pitch  is  higher  the  higher  off  the  baseline  the  photocell  is  located.  If  the 
row  of  photocells  moves  slowly  over  a  letter,  a  series  of  signals  is  generated 
which  can  be  interpreted  by  the  “inner  ear,”  by  the  auditory  imagination, 
as  graphic  characters.  These  machines  are  relatively  inexpensive;  they  are 
also  relatively  slow.  The  main  reason  for  the  speed  limitation  is  the  fact 
that  they  yield  about  three  intensity  peaks  per  letter  on  the  average.  Thus,  in 
traversing  a  lower  case  “h,”  for  example,  there  is  encountered  first  a  vertical 
bar,  then  a  horizontal  line,  then  another  shorter  vertical  bar.  Each  of  these 
three  features  gives  a  distinct  signal  and  each  must  be  interpreted  and  added 
to  the  next  before  the  letter  “h”  can  be  recognized  by  the  human  reader.  As 
you  will  see  below,  there  are  definite  limitations  on  the  number  of  signals  the 
ear  can  interpret  within  a  given  time.  With  an  average  of  three  signals  per 
letter  one  can  count  on  a  maximum  of  30  words  a  minute  or  so — which 
figure  coincides  with  the  findings  of  investigators  using  this  type  of  reader  to 
date.  Let  me  repeat,  however,  that  the  machine  is  inexpensive  and  simple. 
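
The scanning principle just described can be sketched in a few lines. Everything concrete here is an illustrative assumption: the six tone frequencies, the coarse bitmap of a lower case “h,” and the function name `scan_columns` are invented for the example and are not taken from any actual Optophone.

```python
# Hypothetical tone assignment, lowest photocell first (values in Hz).
TONES_HZ = [400, 500, 600, 700, 800, 900]

# Coarse bitmap of a lower case "h", top row first; "X" marks black.
LETTER_H = [
    "X....",
    "X....",
    "XXXX.",
    "X...X",
    "X...X",
    "X...X",
]


def scan_columns(bitmap, tones_hz):
    """Sweep a vertical row of photocells across the letter, left to right.

    For each column, return the tones sounded by the cells that see black.
    Cells higher above the baseline sound higher tones.
    """
    height = len(bitmap)
    chords = []
    for col in range(len(bitmap[0])):
        chord = [
            tones_hz[height - 1 - row]
            for row in range(height)
            if bitmap[row][col] == "X"
        ]
        chords.append(sorted(chord))
    return chords


for chord in scan_columns(LETTER_H, TONES_HZ):
    print(chord)
```

The sweep yields a full chord for the tall vertical bar, a single sustained tone for the horizontal line, and a lower three-tone chord for the shorter vertical bar: the three successive signals that the listener must combine to recognize the letter.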

Recognition  machines  are  altogether  different,  and  resemble  in  fact 
commercial  machines.  Although  there  are  some  compromises  one  can  make 
in  a  recognition  machine  for  the  blind,  which  I  will  point  out  later,  they  do 
the  same  job  as  commercial  machines;  they  recognize  a  letter.  What  does 
that  mean?  As  Rabinow  points  out,  there  are  three  basic  ways  of  recognizing 
letters:  by  line  tracing,  by  feature  analysis,  and  by  mask  matching.  All  of 
these  methods  presuppose  that  inside  those  “black  boxes”  I  spoke  of  before 
there  is  some  kind  of  memory  in  which  is  stored  something  against  which  a 
match  is  made.  The  storage  may  take  many  forms:  masks;  a  network  of 
resistors;  a  diode  matrix;  or  something  else.  Of  course  these  memory  and 
matching  elements  add  to  the  cost  of  the  machines.  These  machines  are 
relatively  fast;  they  usually  give  one  signal  per  letter.  This  feature  triples 
the  speed  of  reading  as  compared  with  the  Optophone.  It  also  triples  the 
cost — or  more. 
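
Of the three approaches, mask matching is the simplest to illustrate: the machine holds a stored mask for each character and reports the one that agrees best with the scanned image. The following is a minimal sketch under stated assumptions, not a description of any commercial machine. The 3-by-3 masks, the cell-counting score, and the names `MASKS`, `agreement`, and `recognize` are all hypothetical.

```python
# Hypothetical stored masks ("memory"): "1" is black, "0" is white.
MASKS = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def agreement(image, mask):
    """Count the cells in which the scanned image and the mask agree."""
    return sum(
        cell_a == cell_b
        for row_a, row_b in zip(image, mask)
        for cell_a, cell_b in zip(row_a, row_b)
    )


def recognize(image, masks=MASKS):
    """Return the name of the stored mask that best matches the image."""
    return max(masks, key=lambda name: agreement(image, masks[name]))


# An "L" with one cell missing is still matched to the "L" mask.
print(recognize(["100", "100", "110"]))  # L
```

The machine then emits one signal per recognized letter, which is the source of the speed advantage, while the stored masks and the matching circuitry are the source of the added cost.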

Integrating  machines  have  been  described  as  having  the  advantages  of 
both the direct translation and the recognition type machine, and as
avoiding the disadvantages of both. The machine proposed by Dr. Clowes, with
“the  interface  put  on  the  feature  level”  and,  as  he  says,  “stopping  short  of 
full  recognition,”  would  be  this  type  of  machine.  In  short,  the  integrating 
machine  integrates  certain  features  of  the  character  into  one  signal,  which 
the  user  must  learn  to  recognize.  This  requirement  poses  a  training  problem. 
But  this  type  of  machine,  which  is  potentially  cheaper  than  the  recognition 
type and probably more expensive than the direct translation type, could
possibly be as fast as the recognition type, depending on the code it
produces.

This  classification  of  reading  machines  doesn’t  really  say  very  much 
about  the  codes  which  they  can  or  cannot  typically  produce  as  an  output. 
The  classification  also  seems  somewhat  arbitrary  and  may  be  incomplete. 
In  addition,  it  is  not  immediately  apparent  how  to  apply  this  breakdown  to 
the  tactile  type  of  output.  It  may,  therefore,  help  clarify  the  situation,  if 
we  consider  for  a  few  moments  the  sensory  restrictions  on  the  selection 
of  the  output  code. 

SENSORY RESTRICTIONS ON OUTPUT
CODE SELECTION

As  I  said,  the  output  code  can  be  chosen  with  a  certain  amount  of  freedom; 
but  there  are  some  restrictions.  The  printed  text  is  of  course  a  code  which 
presents  the  information  to  the  human  eye  in  a  way  that  is  very  well  suited 
for this purpose. Likewise, tactile codes have to be suited to the
transmission abilities of the human skin; auditory codes must be suited to the
design  of  the  human  ear.  This  sounds  obvious,  but  it  is  not  always  fully 
recognized. 

In  considering  the  function  of  the  eye  in  looking  at  a  printed  character, 
and  disregarding  all  other  cues  such  as  color,  brightness,  etc.,  except 
the  presence  or  absence  of  black  at  each  location  of  the  visual  field,  we 
can  drastically  simplify  the  representation  of  the  visual  field  within  the 
optical  nerve  by  saying  that  each  of  its  nerve  fibers  could  be  “tagged”  as 
to  the  location  of  its  end  organ  on  the  retina.  Similarly,  we  can  “tag”  the 
nerve  fibers  leading  to  the  skin  as  to  the  location  of  their  associated  pressure 
receptors  on  the  body  surface.  But  the  situation  is  changed  in  the  ear.  Here 
the  nerve  fibers  end  in  the  cochlea  where  they  are  associated  with  certain 
sound  frequencies;  and  these  are  not  interpreted  by  the  sense  of  hearing  as 
a  spatial  location  as  is  the  case  with  the  senses  of  vision  and  touch.  With 
respect  to  the  transmission  of  information,  however,  one  can  group  these 
locations  in  the  cochlea  together  with  the  spatial  location  of  nerve  fiber  end 
organs  in  the  retina  and  on  the  surface  of  the  skin. 

As  soon  as  one  speaks  of  spatial  location  of  sensory  organs,  the  question 
of  resolution  arises.  The  eye  has  an  astounding  resolution;  for  one  thing, 
it  is  two-dimensional;  and,  secondly,  it  can  distinguish  two  targets  as  little 
as  one  minute  of  arc  apart,  which  explains  why  we  are  able  to  read  the  fine 
print on an insurance policy. The resolution of the skin is also
two-dimensional, but by no means as good as that of the eye; this is the reason why I
queried Dr. Clowes whether a user of his proposed device has to wiggle
his finger when he wants to identify its embossed letters. In reading
embossed letters, it is usually admitted that such wiggling is necessary,
because the resolution of the skin at the fingertip is just not quite good enough
for the task. This is the reason, by the way, why braille was conceived.
Recent investigation has indicated braille is not only adequate in this
respect; in fact it hardly admits of improvement. It is just about the ideal way
to  make  optimal  use  of  the  available  resolution  of  the  fingertips  for  getting 
64  answers  by  using  all  possible  variations  of  a  six-dot  array.  With  respect 
to  the  ear,  the  resolution  is  not  two-dimensional,  as  with  the  skin  and  the 
retina;  it  is  one-dimensional  and  consists  just  in  the  discrimination  of  pitch. 
There are, therefore, not more than a few hundred possible threshold
detection differences of pitch available for a code to be used in a reading
machine with an audible output.

In addition to spatial resolution there is another type of resolution for all
three of these senses: a resolution in time, which is usually called “temporal
resolution.” In practice, this is expressed as the frequency at which a signal
can be repeated before fusion occurs between two successive signals. The
demand that the Optophone type machine places on temporal resolution
is what I referred to before as one of its inherent limitations. All of the
senses are about equally poor in temporal resolution; it is obviously a
property of the human nervous system that when one goes beyond 15 or so
signals per second, fusion takes place. The point is about the same for each
sense:  with  over  15  or  20  flashes  of  light  per  second,  we  see  no  longer  a 
flicker  but  a  continuous  light  impression;  with  the  ear,  one  gets  a  buzz  at 
that  repetition  rate;  and  in  the  skin  it  gives  rise  to  a  sense  of  vibration  rather 
than  that  of  a  sequence  of  single  pressure  pulses.  If  the  single  signals  are  not 
just  pulses,  but  carry  more  than  one  bit  of  information  and  it  is  therefore 
essential  that  one  interprets  each  successive  signal  (as  for  instance  in 
a  sequence  of  letters),  then  the  rate  of  presentation  must  be  halved.  Eight 
signals per second is about the fastest speed one can use. For a continuous
series of letter signals following one another, a calculation shows that this
corresponds to the well-known figure of 80 or 90 words per minute. This is three
times  the  speed  of  the  Optophone  when  used  by  a  skilled  reader.  We  can  of 
course  understand  speech  at  much  higher  rates,  but  the  signal  unit  here  is  a 
syllable,  not  a  letter.  This  fact  introduces  another  threefold  increase  in  speed, 
for  something  like  250  words  a  minute  in  speech. 
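
The reading rates quoted above can be reproduced with a short calculation. The rate of eight signals per second and the figure of three signals per letter for the Optophone come from the text. The average of 5.5 letters per word (counting the space) is an assumption introduced here, as is the treatment of a syllable as standing for about three letters, and the function name `words_per_minute` is hypothetical.

```python
def words_per_minute(signals_per_second, signals_per_letter, letters_per_word=5.5):
    """Convert a signal rate into a reading rate.

    letters_per_word is an assumed average, including the word space.
    """
    letters_per_minute = signals_per_second * 60 / signals_per_letter
    return letters_per_minute / letters_per_word


# One signal per letter (recognition type machine).
print(round(words_per_minute(8, 1)))      # 87
# About three signals per letter (Optophone).
print(round(words_per_minute(8, 3)))      # 29
# One signal (a syllable) standing for about three letters (speech).
print(round(words_per_minute(8, 1 / 3)))  # 262
```

Under these assumptions the three results fall close to the 80 to 90, the 30 or so, and the roughly 250 words per minute given in the text.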

An  important  generalization  is  possible  based  on  these  considerations: 
since the spatial resolution of the three senses discussed is much superior
to their temporal resolution, it is much more useful to utilize many parallel
channels  into  the  brain  simultaneously  than  to  present  information  as  a  time 
sequence  of  discrete  events  in  one  channel.  Or,  as  Dr.  Clowes  has  pointed 
out,  the  more  independent  attributes  or  the  more  independent  nerve  stimuli 
are  transmitted  to  the  brain,  the  more  information  can  be  conveyed.  There 
is,  however,  an  important  qualification  regarding  parallel  fiber  excitation. 

Visualize  a  checker  board,  subdivided  8  by  8  into  64  fields,  and  allow 
each field to be either black or white. Now of course this gives a total
of 2^64, or approximately 2 × 10^19, possible combinations of black and white fields
on  the  checker  board.  While  an  electronic  scanner  would  have  no  trouble 
identifying  any  one  of  these  combinations,  the  eye  would  classify  the  vast 
majority of them just as mottled, mixed-up checker boards. And if you
had two of these mottled checker boards which differed by just
a few of the black or white fields having been exchanged, it would hardly
be possible to use them as two characters of a code and have the eye read
them. But as soon as you arrange the black or white fields into lines or
geometric figures, the eye becomes quite powerful in distinguishing differences
and  recognizing  configurations.  It  thus  becomes  apparent  that  the  eye  and 
the  associated  vision  center  in  the  human  brain  must  have  established 
during  the  development  of  the  human  species  certain  selective  preferences 
for  quite  definite  patterns  of  parallel  excitation  of  the  optical  nerve  fibers 
which  correspond  to  certain  definite  arrangements  of  contrasting  spots  in 
the  visual  field  which  we  commonly  call  “shapes.”  This  is  similar  to  what 
psychologists  refer  to  as  “Gestalt”  perception  (“Gestalt”  means  “shape”  in 
German)  and  we  must  take  this  factor  into  consideration.  With  the  tactile 
sense  we  find  a  similar  selective  preference  for  pre-established  “shapes,” 
only  their  definition  is  much  coarser  because  the  resolution  is  much  less 
fine  than  in  the  case  of  the  eye.  Yet  there  is  a  notable  coincidence  between 
the pre-established shapes perceived by our senses of vision and touch,
respectively, which is not surprising because these shapes have been derived
from the same source, the physical objects in our environment. This
coincidence is the reason for the recurring proposals to use embossed letters as
a  reading  machine  output,  which  is  a  fine  idea,  if  the  resolution  problem 
can  be  solved. 
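
The size of the checker board example can be checked directly: a display of n two-valued fields admits 2^n patterns, so the 64-field board gives 2^64 of them, just as the six-dot braille cell mentioned earlier gives 64. The sketch below also counts the fields in which two boards differ, which is trivial for a scanner however hard it may be for the eye. Representing a board as an integer with one bit per field is a choice made here for illustration, and the function names are hypothetical.

```python
def pattern_count(fields):
    """Number of distinct black/white patterns on a display of `fields` cells."""
    return 2 ** fields


def fields_differing(board_a, board_b):
    """Count the fields in which two boards differ.

    Each board is an integer whose bits stand for the fields (1 = black).
    """
    return bin(board_a ^ board_b).count("1")


print(pattern_count(64))  # 18446744073709551616, about 1.8 x 10^19
print(pattern_count(6))   # 64

# Two "mottled" boards that differ only because one black field and one
# white field have been exchanged.
board = 0b1011001110001011
other = 0b1011001110001101
print(fields_differing(board, other))  # 2
```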

With  the  ear  a  totally  different  situation  obtains,  although  basically  it 
works  in  the  same  way.  If  we  oversimplify  somewhat  and  consider  the 
cochlea the recognition instrument, we can feed to it an arbitrary
combination of audio frequencies and expect the cochlea to distinguish it from
a second arbitrary frequency combination. Most likely the cochlea will
classify both combinations as “noise.” If we compose the frequencies more
planfully,  however,  the  cochlea  will  identify  the  squeaking  of  a  door  and 
distinguish  it  from  the  sound  of  a  violin.  Again  one  encounters  the  fact 
that a selective preference for certain pre-established frequency
combinations determines what the cochlea will identify and will not call noise. And
there  is  one  particularly  sophisticated  group  of  frequency  combinations 
which  happens  to  be  the  code  we  use  in  speaking,  namely  the  phonemes, 
which  the  cochlea  can  interpret  quite  easily  and  fluently.  Thus,  in  a  quasi- 
spatial  sense,  we  can  call  the  phonemes  the  “shapes”  which  the  ear  per¬ 
ceives  and  for  which  it  is  optimally  suited  in  information  transmission. 
However,  there  is  no  coincidence  whatsoever  between  these  “shapes”  and 
the  shapes  of  printed  characters  as  perceived  by  the  eye  or  by  touch. 

CONCLUSIONS 

What  does  this  all  mean  if  we  apply  our  discussion  to  the  three  different 
types of reading machines? What conclusions can we draw from this
digression into the sensory field? For one thing, the term “direct translation”
becomes  more  meaningful.  It  defines  machines  in  which  the  shape  of  a 
visual  character  is  retained  when  it  is  translated  into  tactile  or  auditory 
stimuli.  In  the  tactile  case,  a  direct  translation  machine  would  sense  the 
visual  shape  of  a  letter  by  having  photocells  detecting  black  wherever  it 
occurs  and  raise  corresponding  pins  in  a  display  which  the  fingers  touch. 
Thus,  the  machine  would  provide  an  actual  reproduction  of  the  graphical 
shape  of  the  character,  but  in  a  form  directly  accessible  to  the  sense  of 
touch.  Reading  machines  of  this  general  nature  have  been  proposed  in 
various  forms,  all  of  them  hampered  somehow  in  their  performance  by 
the  limited  resolution  of  the  sense  of  touch.  However,  as  Dr.  Kallmann  has 
pointed  out,  this  should  not  obscure  the  fact  that  the  principle  of  the  tactile 
direct  translation  machine  is  well  suited  for  conveying  pictorial  information 
to  the  blind  person. 

In  the  auditory  case,  as  represented  by  the  Optophone,  the  term  “direct 
translation”  is  applicable  only  by  stretching  its  meaning  a  point  or  two. 
True  direct  translation  into  an  auditory  code  would  be  present  if  one  as¬ 
sociated  a  certain  audio  frequency  with  every  single  cell  in  a  mosaic  of 
photocells  onto  which  a  letter  image  is  projected.  Those  cells  which  would 
“see”  black  would  generate  their  particular  frequency,  thus  producing  a 
typical  compound  sound  for  each  letter  shape.  The  catch  in  the  method 
comes  to  light  when  we  discover  that  most  of  the  compound  sounds  that  we 
would hear would be interpreted by the ear as “noise.” As was mentioned
before,  there  is  simply  no  correlation  between  the  shape  of  a  letter  as  ex¬ 
pressed  by  an  arrangement  of  black  areas  on  a  two-dimensional  surface, 
and  the  “shape”  of  a  phoneme  as  expressed  by  a  combination  of  certain 
audio  frequencies.  Thus  a  direct  translation  of  a  letter  shape  as  seen  by  the 
eye  into  a  compound  sound  which  the  ear  would  interpret  as  a  phoneme 
would  be  quite  a  coincidence.  An  attempt  could  conceivably  be  made  to 
assign  certain  frequencies  in  a  judicious  way  to  the  cells  of  that  photocell 
mosaic,  maybe  by  using  the  rules  of  speech  synthesis  worked  out  by  the 
Haskins  Laboratories;  we  might  even  achieve  the  result  that  a  few  letters 
would  sound  like  phonemes,  but  the  rest  would  surely  be  interpreted  as 
noise  by  the  ear.  There  is  little  hope  that  such  a  direct  translation  will  ever 
be  achieved  for  a  sufficient  number  of  letter  shapes  because  it  is  simply  not 
applicable  to  the  eye/ear  code  conversion  problem. 
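
The idea of a "true" direct translation into sound can be made concrete with a minimal sketch. It assumes a hypothetical 5-by-3 photocell mosaic and an arbitrary assignment of one tone per cell (the base frequency and the spacing are illustrative values, not figures from any actual machine). Every black cell sounds its own tone at once, so each letter shape yields one static chord.

```python
# Minimal sketch: one fixed tone per photocell, all black cells sounding together.
# The mosaic size and the frequency assignment are assumptions for illustration.

def mosaic_to_chord(mosaic, base_hz=200.0, step_hz=100.0):
    """Return the tone frequencies (Hz) sounded by the black cells of a mosaic."""
    cols = len(mosaic[0])
    chord = []
    for r, row in enumerate(mosaic):
        for c, cell in enumerate(row):
            if cell:  # this cell "sees" black
                chord.append(base_hz + step_hz * (r * cols + c))
    return chord

# Hypothetical image of the letter L (1 = black, 0 = white).
LETTER_L = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]

chord_l = mosaic_to_chord(LETTER_L)
print(chord_l)  # seven simultaneous tones, one per black cell
```

Nothing in such an assignment makes the resulting chord resemble a phoneme, which is the difficulty the text describes.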

However,  the  Optophone  does  something  similar  to  this,  and  it  is  rather 
unique  in  this  respect.  Its  working  principle  makes  use  of  the  fact  that 
the  ear  is  the  only  one  of  the  three  senses  where  not  only  parallel  fiber 
excitation,  but  also  the  interpretation  of  time  sequences  of  signals,  con¬ 
tributes  significantly  to  the  identification  of  perceptual  “shapes,”  such  as 
phonemes.  If  phonemes  are  fed  into  a  set  of  filters  and  analyzed  harmonic¬ 
ally,  and  the  result  recorded  on  a  strip  of  paper  as  a  so-called  spectrogram, 
we  find  only  a  very  few  phonemes  that  are  composed  of  just  a  steady 
frequency  mixture.  These  are  usually  called  “vowels.”  As  soon  as  conso¬ 
nants  are  introduced  the  spectrogram  shows  no  longer  a  static  composition, 
but  ups  and  downs  or  bursts  of  energy  and  the  like,  and  we  know  that  the 
ear  is  perfectly  capable  of  interpreting  these  variations  along  the  time  axis. 
The  Optophone  utilizes  this  ability  of  the  ear  in  a  unique  and  ingenious 
way — and  incidentally  deviates  in  so  doing  partially  from  the  recommenda¬ 
tion  I  made  above  to  avoid  the  time  axis  and  to  use  instead  multiple  chan¬ 
nels  for  information  transmission.  In  this  machine,  only  the  vertical  row 
of  photocells  is  designed  to  produce  signals  for  multiple  channels  by  gen¬ 
erating  the  various  frequency  components,  but  the  horizontal  motion  pro¬ 
duces  variations  along  the  time  axis  with  the  result  that  the  “inner  ear”  ex¬ 
periences  an  almost  visible  picture  of  the  letter. 
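
The Optophone principle described above can be sketched in the same spirit. The sketch assumes a hypothetical five-cell vertical row with one arbitrary frequency per cell (highest tone for the top cell, which is an assumption) and treats each column of the letter image as one instant of the horizontal sweep.

```python
# Minimal sketch: vertical position -> frequency channel, horizontal motion -> time.
# The row frequencies and the 5x3 letter image are assumptions for illustration.

def optophone_frames(bitmap, row_freqs_hz):
    """Return one chord (list of Hz) per column, in left-to-right scan order."""
    n_rows, n_cols = len(bitmap), len(bitmap[0])
    if len(row_freqs_hz) != n_rows:
        raise ValueError("need exactly one frequency per photocell row")
    return [
        [row_freqs_hz[r] for r in range(n_rows) if bitmap[r][c]]
        for c in range(n_cols)
    ]

LETTER_L = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
ROW_FREQS_HZ = [800.0, 700.0, 600.0, 500.0, 400.0]  # top row to bottom row

frames = optophone_frames(LETTER_L, ROW_FREQS_HZ)
for t, chord in enumerate(frames):
    print(t, chord)
```

For this image the first instant sounds all five tones (the vertical stroke) and the next two sound only the lowest tone (the foot of the L), so the shape is spread along the time axis rather than packed into a single chord.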

This  machine  is  unique,  as  I  have  said;  and  I  would  also  venture  to  say 
that,  in  the  area  of  direct  translation  from  the  visual  to  an  auditory  code, 
there  seems  hardly  any  other  solution  possible  that  would  be  comparable 
to the Optophone. However, there appears to be some room for improving
the latter by selecting those frequencies that prove to be particularly
suitable, and this is being done.

Now let’s take a look at the so-called “recognition” machines. As mentioned
before, this type of machine recognizes characters. It compares their
shapes  with  the  contents  of  certain  memory  elements  inside  the  machine  and 
comes  up  with  so-and-so  many  answers,  26  or  whatever  number  of  char¬ 
acters  it  is  designed  to  recognize.  Once  recognition  is  achieved,  the  shape  of 
the  character  becomes  irrelevant  and  need  not  influence  the  tactile  or  audi¬ 
tory  output  code  as  it  did  in  the  case  of  the  direct  translation  machines. 
These  codes  can,  therefore,  be  chosen  freely  according  to  their  own  merits, 
which  seems  to  present  us  with  quite  a  cornucopia  of  possibilities.  In  reality, 
the  choice  is  severely  limited  by  considerations  regarding  cost,  training  re¬ 
quirements,  sensory  restrictions,  etc.  On  the  tactile  side  the  braille  code  is 
in  my  opinion  the  only  possible  choice.  It  is  compatible  with  the  resolution 
capabilities  of  the  sense  of  touch;  training  experience  exists;  reading  speeds 
of  80  to  90  words  per  minute  are  possible;  and  the  resulting  output  device 
is  relatively  simple  and  inexpensive.  The  only  disadvantage  of  the  braille 
code  (and  this  applies  most  probably  to  all  tactile  output  codes)  is  the  fact 
that  it  ties  up  the  hands,  which  as  we  will  see  below  are  urgently  needed  for 
other  jobs  in  reading  machines  for  the  blind. 

For  an  auditory  code  there  is  no  doubt  that  any  arbitrary  frequency 
composition  would  be  inadequate  in  a  recognition  machine  and  that 
phonemes  of  some  sort  are  the  most  appropriate  choice.  For  a  library 
machine  where  the  cost  problem  is  not  quite  so  severe,  the  auditory  output 
could  be  an  artificial  language,  such  as  produced  by  using  the  rules  of 
speech  synthesis  developed  by  the  Haskins  Laboratories,  sometimes  called 
machine  English.  Or  one  can  use  the  recognized  characters  as  a  means  for 
looking  up  prerecorded  words  in  a  magnetic  tape  dictionary  and  play  them 
back  as  an  auditory  output.  Both  approaches  might  eventually  permit 
reading  speeds  close  to  real  English.  For  a  personal  type  of  recognition 
machine,  the  cost  problem  forces  us  to  compromise.  In  the  English  language, 
where  there  is  quite  a  discrepancy  between  the  spelling  and  the  pronuncia¬ 
tion  of  a  word,  a  spelled  speech  alphabet  (such  as  that  developed  by  Dr. 
Metfessel)  seems  to  us  the  only  economically  feasible  way  of  conveying 
auditory  information  from  this  type  of  machine  to  the  human  brain.  In 
other  languages,  where  the  correlation  between  spelling  and  pronunciation 
is  closer,  one  could  possibly  use  the  phonemes  themselves  which  might  even 
contribute  something  to  the  reading  speed.  In  any  event,  80  to  90  words  per 
minute  is  probably  the  rate  which  we  can  reasonably  expect  with  this  type  of 
device. 
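
Because the output code of a recognition machine is independent of the letter shape, the output stage reduces to a table lookup. The following minimal sketch assumes two small illustrative tables: standard braille dot numbers for the letters a to e, and hypothetical spellings of the letter names standing in for a spelled-speech alphabet (they are not the sounds of any actual spelled-speech system).

```python
# Minimal sketch: once a character is recognized, any output code can be attached
# to it by table lookup. Both tables are deliberately tiny illustrations.

BRAILLE_DOTS = {
    "a": (1,),
    "b": (1, 2),
    "c": (1, 4),
    "d": (1, 4, 5),
    "e": (1, 5),
}

# Hypothetical stand-ins for prerecorded letter-name sounds.
SPELLED_SPEECH = {"a": "ay", "b": "bee", "c": "see", "d": "dee", "e": "ee"}

def encode(recognized_chars, table):
    """Translate a sequence of recognized characters into the chosen output code."""
    return [table[ch] for ch in recognized_chars]

print(encode("bed", BRAILLE_DOTS))
print(encode("cab", SPELLED_SPEECH))
```

Swapping the table changes the output code without touching the recognition stage, which is the freedom of choice the text refers to.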

This is the kind of machine we are working on,* and, in order to stay
within the cost limitations, we have to make maximum use of the hands of
the blind reader. This precludes the use of a braille output code. The reader
will have to adapt the machine to a given type font, and he will have to adjust
the optical scanning system to the letter size. The text alignment would be,
again, the reader’s task, for this alignment is related to the size requirement.
The same applies to the line feed. He will also move the scanner by hand
and will adjust its speed to his own desire: if he is a proficient reader, he
can move it quickly; whereas if he has not understood what he has read, he
can go back and start again, reading it twice. In general, of course, we
also hope that the development of commercial reading machines will some
day provide us with various elements which we can adapt to or plug into
our machines and which will be relatively inexpensive.

* Under a contract with the Veterans Administration.

To  complete  this  discussion,  I  should  mention  finally  the  so-called  “inte¬ 
grating”  machines.  These  are  fascinating  to  consider.  They  are  based  on 
the  concept  that  it  should  be  possible  to  integrate  the  salient  features  of  a 
character  into  one  compound  signal  and  to  use  the  various  compound  signals 
derived  from  all  the  characters  as  the  output  code  of  the  machine.  This 
concept  places  these  machines  between  the  “direct  translation”  and  the 
“recognition”  machines,  and  the  term  “integrating”  machine  is  sometimes 
used  as  a  catch-all  term  for  any  machine  that  does  not  fall  into  either  of 
the  other  two  categories. 

Integrating machines with a tactile output are hard to visualize and, to
my knowledge, have never been proposed. On the other hand, we ourselves
tried  two  versions  of  them  with  an  auditory  output,  without  much  success. 
It  turned  out  that  we  had  been  much  too  optimistic  in  our  assumptions  re¬ 
garding  the  capability  of  the  human  ear  to  interpret  compound  signals  other 
than  phonemes.  We  are  now  convinced  that  the  concept  is  problematical 
in  principle.  But,  maybe  we  are  now  too  pessimistic. 

Let  me  say  in  conclusion,  that  it  does  not  appear  appropriate  to  state 
at  this  time  that  any  one  of  these  machines  is  inherently  better  than  the 
others.  It  is  still  much  too  early  in  the  development  of  such  reading  ma¬ 
chines  for  anyone  to  say  this.  All  three  of  the  systems  proposed  have  their 
merits;  all  of  them  should  be  pursued  and  supported  at  the  present  time; 
and  we  should  let  the  course  of  time  help  us  to  decide  what  the  best  system 
is  like.  Perhaps  we  shall  end  our  hard  work  with  one  excellent  example  of 
each  of  these  three,  eventually. 


LETTER  SCANNING  FOR 
CHARACTER  RECOGNITION 


SAMUEL  A.  SCHARFF 

Consulting  Engineer,  Englewood,  New  Jersey 


INTRODUCTION 

The  subject  of  this  paper  is  the  sensing  process,  whereby  a  printed  char¬ 
acter  on  a  piece  of  paper  is  sensed  by  an  optical  character  reading  system. 
How  the  system  later  recognizes  the  character,  and  how  it  communicates 
the  character  to  a  blind  user,  are  the  subjects  of  other  papers  in  this  Con¬ 
gress.  My  discussion  will  proceed  in  an  engineering,  rather  than  a  scientific, 
frame  of  reference.  Accordingly,  my  biases  and  prejudices  will  be  exercised 
and  I  will  be  concerned  with  costs,  relative  complexity,  and  other  matters 
of  no  theoretic  importance. 

Sensing  processes  can  be  grouped  into  three  broad  classifications, 
though not without some overlapping. These are: sequential scanning
schemes;  parallel  sensing  schemes;  and  contour  tracing  schemes.  Each  is 
discussed  in  one  of  the  three  following  main  sections.  The  fourth  section 
touches  on  some  general  matters  of  interest  in  connection  with  sensing 
processes,  and  sums  up  the  relative  merits  of  the  schemes  discussed. 

Two  shorthand  expressions  which  occur  in  the  discussion  should  be 
considered  at  the  outset.  First,  the  “exemplar”  is  a  pattern  presented  to 
the  character  reading  system  for  recognition.  It  ought  to  be  a  properly 
formed  alphanumeric  character,  symbol,  or  punctuation  mark;  but  it  may 
be  distorted  or  it  may  be  a  nonsense  shape.  Second,  a  character  reading 
system  may  be  designed  to  handle  one  or  more  of  what  I  will  refer  to  as 
the  “three  types  of  characters.”  These  are: 

(1)  printed  type,  with  fixed  type  fonts  (sometimes  a  special  type  font 
designed  along  with  the  character  reading  system;  sometimes 
standard  typewriter,  printers’,  or  other  fonts) 

(2)  hand  printed  block  capital  letters  and  numerals 

(3)  handwritten  (cursive)  letters. 


SEQUENTIAL SCANNING OF THE EXEMPLAR SPACE

In  this  section,  we  will  consider  sensing  schemes  which  operate  by  scan¬ 
ning,  in  sequence,  all  the  elements  in  the  exemplar  space.  Parallel  sensing 
of  the  whole  exemplar  space  will  be  considered  in  the  third  section  of  this 
paper,  and  in  the  fourth  section  we  will  touch  on  contour  tracing  schemes, 
which  in  effect  ignore  everything  but  the  outline  of  the  figure  of  interest. 

Mechanical  Scanning 

Mechanical  scanning  was  practical  before  electronic  techniques  were  de¬ 
veloped,  so  it  is  not  surprising  that  early  scanning  systems  were  all  me¬ 
chanical.  The  continued  interest  in  mechanical  scanning  shows  that  it  has 
other  corollary  virtues  which  we  might  note  before  considering  some  ex¬ 
amples. 

First,  mechanical  scanning  may  lend  itself  to  the  cause  of  the  inventor 
with  little  cash  and  a  small  workshop.  He  can  get  a  solid  working  equip¬ 
ment  with  little  out-of-pocket  expense;  or  he  can  put  together  a  mock-up 
model without much effort. Then too, the natural selection aspects of the
process of obtaining funds heavily favor mechanical as opposed to electrical
or chemical techniques. Apparently it is far easier to commit money
when  you  can  watch  the  wheels  go  around  and  can  trace  their  functioning 
than  when  staring  at  a  dumb  electronism  which  at  the  most  can  only  blink 
or  give  off  a  greenish  glow.  The  latter  effusions  do  not  often  lead  to  irresisti¬ 
ble  surges  of  generosity.  Finally,  production  of  mechanical  systems  is  readily 
arranged  for,  need  not  be  expensive,  and  can  result  in  equipment  which 
can  be  understood,  for  maintenance  purposes,  by  a  relatively  large  group 
of  technicians:  watch  repairmen,  automobile  mechanics,  typewriter  repair¬ 
men,  and  so  on. 

All  mechanical  scanning  schemes  depend  on  relative  motion  of  a  point 
with  respect  to  a  space  in  which  the  patterns  to  be  recognized  are  pre¬ 
sented — a  space  we  might  call  the  exemplar  space.  The  point  in  relative 
motion  may  be  a  bright  point  of  light  or  an  aperture  through  which  light 
from  different  elements  can  pass;  in  these  cases  a  stationary  photocell  with 
a  field  of  view  encompassing  the  entire  space  performs  a  bit-by-bit,  optical- 
to-electrical  conversion.  Alternatively,  the  point  in  relative  motion  may 
be  the  photocell  itself;  if  small  compared  to  the  space,  or  if  fitted  with 
optics  restricting  the  energy  it  receives  to  that  from  a  point  in  the  space, 
relative  motion  of  the  cell  can  effect  the  scan. 

Both of these approaches have been used. Each can be instrumented
in  a  large  number  of  ways.  The  following  sections  illustrate  some  of  the 
possibilities.  We  consider  first  some  slotted  disc  scanners;  then  drum  scan¬ 
ners;  and  finally  a  relative  motion  cell  device. 

Slotted  Discs.  The  first  of  the  mechanical  scanning  schemes  is  based  on  some 
version  or  combination  of  spiral  and  radial  slotted  discs.  These  may  be  lo¬ 
cated  in  a  projection  system  so  that  a  moving  spot  of  light  will  be  projected 
on  the  exemplar  space  which  is  all  within  the  collection  zone  of  a  detector. 
The  discs  may  be  located  in  the  collection  path  also.  There  are  many  vari¬ 
ations. 

The  Farrington  (Intelligent  Machines  Research)  Company  character 
reading  machines  employ  variations  on  this  sort  of  scan  (22).  One  system, 
developed  to  sort  letters  by  city  of  address  for  the  U.  S.  Post  Office  De¬ 
partment,  substitutes  motion  of  the  envelope  for  the  rotation  of  one  disc. 
One  interesting  aspect  of  this  machine  not  directly  related  to  our  present 
purpose  may  be  noted  in  passing:  it  ‘recognizes’  words  rather  than  letters. 
Individual  letters  are  classified  as  (1)  having  ascenders  (b,  d,  t,  etc.); 
(2) having descenders (j, q, p, y, etc.); (3) having three crossings of vertical
(a,  e,  s,  y);  and  (4)  lower  case  q.  The  combined  sequential  occurrences 
of  the  four  classes  allow  the  machine  to  assign  the  envelope’s  address  to 
one  of  40  cities.  It  rejects  doubtful  cases  and  also  any  handwritten  ad¬ 
dresses.  The  rate  of  operation  is  about  10,000  envelopes  an  hour,  or  about 
three  envelopes  per  second. 
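
The word-level approach can be illustrated with a minimal sketch. It is not the Farrington logic: it assumes a simplified three-way classification (ascender, descender, other) with letter sets chosen here for illustration, and a hypothetical three-city table. A word is reduced to the sequence of its letter classes, and the envelope is accepted only if that sequence matches exactly one city.

```python
# Minimal sketch of word recognition from coarse letter classes.
# The class sets and the city list are assumptions, not the machine's actual tables.

ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def word_signature(word):
    """Reduce a word to its sequence of letter classes: A, D, or x."""
    sig = []
    for ch in word.lower():
        if ch in ASCENDERS:
            sig.append("A")
        elif ch in DESCENDERS:
            sig.append("D")
        else:
            sig.append("x")
    return "".join(sig)

def build_table(cities):
    """Group the known city names by signature."""
    table = {}
    for city in cities:
        table.setdefault(word_signature(city), []).append(city)
    return table

def sort_envelope(word, table):
    """Return the matching city, or None to reject a doubtful case."""
    matches = table.get(word_signature(word), [])
    return matches[0] if len(matches) == 1 else None

TABLE = build_table(["Boston", "Albany", "Chicago"])
print(word_signature("Albany"))          # xAAxxD
print(sort_envelope("Albany", TABLE))    # Albany
print(sort_envelope("Denver", TABLE))    # None (rejected)

# The quoted throughput, converted to envelopes per second.
envelopes_per_second = 10_000 / 3600
print(round(envelopes_per_second, 2))    # 2.78
```

Two cities that share a signature would both be rejected by this sketch, which suggests why a finer set of classes helps when the list of destinations grows.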

This  type  of  scanning  need  not  be  limited,  as  in  this  example,  to  testing 
for  the  presence  of  four  characteristics.  It  could  be  designed  to  examine  in 
succession  each  element  of  the  exemplar  space,  producing  a  time  varying 
electrical  signal  which  would  be  the  transformed  equivalent — a  mapping 
into  electrical  form — of  the  pattern  in  the  exemplar  space.  If  used  in  this 
fashion,  the  scheme  could  produce  electrical  data  on  any  of  the  three  types 
of characters. The minimum element size would be fixed by the precision
of the components and the ingenuity of the designers: the minimum element
size in the Farrington machine is 0.005 inch (1).

The  Farrington  scanning  scheme  is  fairly  simple  in  concept.  Com¬ 
plications  multiply  as  the  designer  digs  into  the  problems  of  recognition, 
and  especially  into  recognition  of  imperfect  characters.  Some  of  the  com¬ 
plications  can  be  handled  by  adding  complications  to  the  scanning:  we 
find  features  to  cope  with  slanted  characters,  characters  not  printed  squarely 
on  the  base  line,  base  lines  not  spaced  regularly,  base  lines  slanted,  and 
so  forth  (2).  Scanner  complexity  can  be  reduced  of  course  by  the  use 
of  more  elaborate  ‘logic’  for  the  recognition  process. 


The  speeds  attainable  with  flying  spot  mechanical  scanning  are  high 
enough  that  paper  handling  difficulties  limit  the  system  performance.  An 
early  Farrington  machine  handled  200  double-spaced  typewritten  pages 
per  hour.  The  Post  Office  device  uses  a  17,000  rpm  scanning  disc  of  7-inch 
diameter  with  44  radial  slots  which  is  fast  enough  to  ‘read’  envelopes 
moving  at  45  inches  per  second. 
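
The figures quoted for the Post Office device can be checked with a short calculation. It assumes that every radial slot produces exactly one scan line as it passes the field, which the text does not state explicitly.

```python
# Minimal sketch: scan rate and scan-line spacing implied by the quoted figures.
# Assumption: one scan line per slot per revolution.

RPM = 17_000                     # disc speed
SLOTS = 44                       # radial slots on the disc
ENVELOPE_SPEED_IN_PER_S = 45.0   # envelope transport speed

scans_per_second = RPM / 60 * SLOTS
scan_pitch_in = ENVELOPE_SPEED_IN_PER_S / scans_per_second

print(round(scans_per_second))   # 12467 scan lines per second
print(round(scan_pitch_in, 4))   # 0.0036 inch of envelope travel per scan line
```

Under that assumption the envelope advances a little under four thousandths of an inch between successive scan lines, the same order of magnitude as the 0.005 inch element size quoted earlier.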

The  photocell  output  of  these  scanning  systems  is  suitable  for  ‘logic’ 
operation  in  either  digital  or  analogue  form.  The  Farrington  systems  use 
both.  Serial-to-parallel  conversion  circuitry  and  buffer  code  conversion  or 
storage  circuitry  is  available,  if  needed,  to  fit  in  with  over-all  system  de¬ 
signs.  Imperfect  characters  cause  no  special  difficulty  to  scanners  of  this 
type,  as  noted  above,  however  difficult  and  complex  the  ‘logic’  of  recogni¬ 
tion  may  then  have  to  become. 

Drum.  The  second  mechanical  scanning  scheme  is  based  on  a  rotating  drum. 
A page bearing the characters to be scanned is mounted on the drum. A moving
spot of light, or a moving photocell, travels along a line parallel to the
drum  axis  while  the  drum  rotates;  thus  the  entire  page  can  be  scanned. 
This  is  the  basic  mechanism  of  most  facsimile  transmitters. 

A  1957  investigation  of  the  problems  of  recognition  of  generalized 
patterns  (not  restricted  to  characters)  at  the  U.  S.  National  Bureau  of 
Standards used this scheme (14). It examined a 44 mm by 44 mm area in
terms of 30,976 square elements, each a quarter-millimeter on a side,
which it classified as ‘light’ or ‘dark.’ In less than 25 seconds the entire
pattern was scanned and the data stored in 704 words of the NBS SEAC
computer store. Just recently the Cornell Aeronautical Laboratory indicated
that a somewhat similar approach is being used there (19). A facsimile
transmitter  examines  a  5-inch  square  zone  with  a  resolution  of  100  lines 
per  inch,  producing  data  on  250,000  cells.  Each  is  reported  in  terms  of  a 
16-level  scale  of  intensity.  The  scan  takes  90  seconds.  Thus  the  scanner 
produces  an  output  of  a  million  binary  bits  of  information,  and  this  is  fed 
to  an  IBM  704  computer  for  processing. 
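
The data volumes quoted for the two drum scanners follow directly from the stated dimensions, as the short check below shows. The only inferred quantity is the number of picture elements packed into each word of the NBS computer, obtained by dividing the element count by the 704 words mentioned.

```python
import math

# NBS scanner: 44 mm square field, quarter-millimeter elements, one bit each.
nbs_elements_per_side = round(44 / 0.25)        # 176
nbs_elements = nbs_elements_per_side ** 2       # 30,976
nbs_bits_per_word = nbs_elements / 704          # elements packed per stored word

# Cornell scanner: 5-inch square field at 100 lines per inch, 16 intensity levels.
cornell_cells = (5 * 100) ** 2                  # 250,000
bits_per_cell = int(math.log2(16))              # 4 bits encode 16 levels
cornell_bits = cornell_cells * bits_per_cell    # 1,000,000

print(nbs_elements, nbs_bits_per_word)          # 30976 44.0
print(cornell_cells, cornell_bits)              # 250000 1000000
print(round(cornell_cells / 90))                # about 2778 cells per second
```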

This  scanning  scheme  can  produce  electrical  data  on  any  of  the  three 
types  of  characters.  The  resolution,  as  noted,  can  be  0.01  inch,  one-half  of 
one  percent  of  the  exemplar  space.  The  scheme  is  fairly  simple,  possibly  less 
so  than  the  flying  aperture  schemes  already  discussed  as  far  as  its  me¬ 
chanism  is  concerned.  It  lends  itself  less  conveniently  than  do  the  flying 
aperture  versions,  perhaps,  to  the  incorporation  of  adjustments  to  cope 
with imperfect characters. It is comparable to the earlier scheme as far
as  speed  is  concerned.  Some  difficulty  might  be  encountered  in  loading 
sheets  on  the  drum,  but  this  could  be  surmounted  with  a  minimum  of  ingenu¬ 
ity;  vacuum  holding  techniques  suggest  themselves.  In  any  case  it  is  clearly 
possible  for  these  scanners  to  scan  faster  than  sighted  readers  can  scan. 

The  photocell  output  of  this  type  of  scanner  lends  itself  to  any  type 
of  ‘logic’  or  buffering  likely  to  be  desired.  The  difficulties  of  reading  im¬ 
perfect  characters  would  fall  not  on  this  scanner  but  on  the  associated 
‘logic’  subsystem. 

Facsimile  transmitters  using  these  concepts  have  been  developed  quite 
far in the course of many years’ service in news photography and weather
map  transmission.  Some  machines  use  the  ‘developed  drum  surface’  varia¬ 
tion  on  the  scheme:  the  paper  lies  flat  in  a  carriage  which  moves  linearly, 
while  the  light  or  aperture  moves  linearly  on  a  perpendicular  to  the  line 
of  carriage  motion.  These  may  be  easier  machines  on  which  to  mount 
paper,  but  tend  to  be  mechanically  a  little  more  complicated.  Otherwise 
they  are  equivalent. 

Moving  Cells.  The  third  mechanical  scheme  is  quite  different  from  the  first 
two.  Harmon  has  reported  a  working  model  illustrating  the  principles  of  a  rec¬ 
ognition  process  he  developed  and  has  made  studies  of  what  might  be  ex¬ 
pected  from  application  of  the  idea  to  character  reading  (13).  Not  only  is  the 
mechanization  different  in  his  device;  there  is  also  at  work  here  a  funda¬ 
mentally  different  concept  of  the  relation  between  scanning  and  recognition 
(‘logic’)  functions. 

Harmon’s  model  uses  32  photocells.  These  are  arranged  on  the  circum¬ 
ference  of  a  circle  of  variable  radius.  To  perform  a  scan  the  circle  is  ex¬ 
panded  from  minimum  to  maximum  radius,  each  photocell  moving  radially 
outward. The relative times at which the cells pass through the lines of an
exemplar provide sufficient data for the ‘logic’ circuits to classify the
exemplar. The model can recognize n-sided figures, where n ranges from
three to six, circles, and no-figures. (It can also distinguish and count up
to  six  separate  shapes  within  the  exemplar  space,  with  some  limitation  on 
location,  shape,  and  size.) 
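
A minimal sketch may help to show why crossing times alone can classify such figures. It is not a reconstruction of Harmon's circuits. It assumes a regular polygon centred on the expanding circle and a uniform rate of expansion, so that each cell's crossing time is proportional to the radius at which it meets the outline. Corners then appear as cells that cross later than both of their neighbours, and counting them gives the number of sides.

```python
import math

N_CELLS = 32  # photocells on the expanding circle, as in the model described

def crossing_radii(n_sides, circumradius=1.0, rotation=0.0, n_cells=N_CELLS):
    """Radius at which each cell crosses the outline of a centred regular polygon."""
    sector = 2 * math.pi / n_sides                     # angle subtended by one side
    apothem = circumradius * math.cos(math.pi / n_sides)
    radii = []
    for k in range(n_cells):
        theta = 2 * math.pi * k / n_cells              # direction of cell k
        local = (theta - rotation) % sector            # angle past the last corner
        radii.append(apothem / math.cos(local - sector / 2))
    return radii

def count_corners(radii):
    """Count cells whose crossing is later than that of both neighbours."""
    n = len(radii)
    return sum(
        1 for i in range(n)
        if radii[i] > radii[i - 1] and radii[i] > radii[(i + 1) % n]
    )

def classify(radii):
    corners = count_corners(radii)
    return "circle" if corners == 0 else f"{corners}-sided figure"

print(classify(crossing_radii(3)))                               # 3-sided figure
print(classify(crossing_radii(5, circumradius=0.5, rotation=0.1)))  # 5-sided figure
print(classify([1.0] * N_CELLS))                                 # circle
```

The count does not change when the figure is rotated or scaled, which illustrates the kind of invariance the transformation provides. Off-centre, distorted, or irregular figures would need more careful logic than this sketch contains.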

The  noteworthy  virtue  of  this  approach  is  that,  using  quite  simple 
recognition  ‘logic’  circuits  (about  50  vacuum  tubes),  Harmon  obtained  a 
machine  which  can  recognize  exemplars  independently  of  angular  orienta¬ 
tion,  size,  distortion  of  the  outline,  and  location  within  the  exemplar  space. 
There  are  limits,  in  practice,  to  the  imperfections  tolerable  under  the  last 
three of these factors, due to the mechanical construction and to the fixed
size  of  the  exemplar  space,  of  course;  but  these  limitations  are  not 
inherent  in  the  concept  and  would  not  prevent  their  practical  employment. 

It  is  important  to  note  that  Harmon  did  not  derive  this  scanning  con¬ 
cept  from  considerations  of  mechanism  or  the  scanning  process  alone. 
Rather,  it  appears,  he  sought  a  transformation  which  would  produce  similar 
results  independently  of  the  orientation,  etc.,  of  the  exemplars.  Having 
found  one,  he  then  had  in  his  hand  clear  guides  both  as  to  how  the  scan¬ 
ning  should  be  done,  and  to  how  the  recognition  ‘logic’  should  be  con¬ 
structed. 

Though  we  have  spoken  and  we  think  of  scanning  and  logic  functions 
separately,  it  should  be  emphasized  that  this  is  an  artificial  division.  It  is 
a  division  which  serves  nicely  for  discussion  of  techniques,  but  it  can  be 
a  barrier  to  progress  in  actually  designing  systems  unless  we  remember 
that  it  is  a  division  for  convenience  and  that  any  design  can  take  advantage 
of  the  fact  that  the  two  are  closely  related.  Harmon’s  machine  is  a  clear 
demonstration of this principle. Though it is not certain, it would appear
that n-gon recognizers built within the design context of the Farrington
machine would be much more elaborate than Harmon’s. The Farrington
scanners  might  be  considered  less  specialized.  Their  output  can  be  handled 
by  a  variety  of  processes  simply  by  coupling  them  to  a  computer.  Harmon’s 
version,  it  might  be  argued,  is  more  specialized;  at  least  a  form  of  polar- 
to-rectangular  coordinate  conversion  might  be  required  to  make  its  output 
the  equivalent  of  the  others.  Nonetheless  Harmon’s  system  has  a  clear  over¬ 
all  advantage. 

Harmon’s  study  of  character  recognition  with  systems  of  this  sort  was 
made  in  terms  of  a  field  of  photocells  arranged  in  concentric  circles,  rather 
than  in  terms  of  mechanically  moving  cells  (i.e.,  32  rings  of  64  cells  each, 
dividing  the  exemplar  space  into  2048  cells  of  varying  size).  From  the 
viewpoint  of  the  whole  concept  the  two  types  are  equivalent.  In  the 
Harmon device the cells are gated ‘on’ by successively larger-radius rings.
For  design  and  construction  purposes,  which  system  would  be  best  would 
probably  depend  on  “designer’s  choice.” 
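
The polar cell field and the polar-to-rectangular conversion mentioned above can be sketched as follows. The sketch assumes rings of equal radial width, which the text does not specify, and a unit outer radius. It shows how widely the cell sizes vary across such a field.

```python
import math

RINGS, CELLS_PER_RING = 32, 64   # figures quoted in the text
# Assumption: rings of equal radial width, outer radius normalized to 1.

def cell_center(ring, cell, max_radius=1.0):
    """Rectangular (x, y) coordinates of the center of a polar cell."""
    r = (ring + 0.5) * max_radius / RINGS
    theta = (cell + 0.5) * 2 * math.pi / CELLS_PER_RING
    return r * math.cos(theta), r * math.sin(theta)

def cell_area(ring, max_radius=1.0):
    """Area of any one cell in the given ring (ring 0 is innermost)."""
    dr = max_radius / RINGS
    inner, outer = ring * dr, (ring + 1) * dr
    return math.pi * (outer ** 2 - inner ** 2) / CELLS_PER_RING

total_cells = RINGS * CELLS_PER_RING
area_ratio = cell_area(RINGS - 1) / cell_area(0)

print(total_cells)         # 2048
print(round(area_ratio))   # 63: outermost cells are far larger than innermost ones
```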

It  is  clear  that  a  mechanical  dilating  ring  scanner  could  handle  the 
three  types  of  characters.  This  is  a  fairly  simple  system,  especially  so  since 
imperfect  exemplars  can  be  handled  without  complicating  the  scanning. 
It  could  be  made  fast  enough  to  keep  up  with  low  reading  rates,  although 
exceeding  a  ten  character  per  second  reading  speed  would  require  a  dif¬ 
ferent  design.  (Many  alternative  realizations  of  the  idea  could  undoubtedly 
be developed.) The photocell output could be used to drive conversion
and  buffering  circuits  as  required,  but  it  is  of  course  uniquely  well  suited  to 
drive  circuits  like  those  of  Harmon. 

Electrical  Scanning 

Electrical  scanning  rests  on  cathode  ray  tube  technique.  To  use  it  may 
require,  therefore,  more  money  and  more  sophistication  in  terms  of  tools 
than mechanical scanning. If a project requires a nonstandard (e.g.,
unusually high resolution) tube, painstaking effort by experimental engineers
and highly skilled technicians will be needed. Yet there are offsetting
advantages. For one thing, the cathode ray tube art is now so well developed
that  one  might  buy  quite  cheaply  from  stock  some  suitable  tubes  for  a 
given  design.  In  speed,  of  course,  the  cathode  ray  tube  exceeds  by  far  all 
mechanical  schemes.  The  cathode  ray  tube  pattern  can  also  be  “steered” 
about  the  exemplar  space  if  necessary  without  adding  to  the  mechanical 
complexity  of  the  scanner.  (The  electrical  circuitry  involved  is  then  ex¬ 
panded.)  Maintenance  of  an  electrical  scanner  requires  technicians  with 
experience  in  electronic  equipment,  of  course,  but  since  the  equipment 
can  be  built  of  plug-in  modules,  actually  putting  a  defective  system  back 
into  service  goes  very  quickly  in  comparison  with  the  usual  mechanical 
system  repairs. 

Two  complementary  kinds  of  cathode  ray  tube  scanning  exist.  One, 
the  flying  spot,  involves  a  tube  like  that  of  the  cathode  ray  oscilloscope  or 
television  receiver:  a  bright  spot  travels  on  the  surface  of  the  tube  face. 
This  spot  is  used  with  suitable  optical  components  to  scan  the  exemplar 
space;  a  photocell,  observing  the  whole  zone,  produces  the  electrical  signal 
analogue  of  the  exemplar.  The  other  type  uses  light  from  the  whole  ex¬ 
emplar  space  falling  on  the  face  of  the  tube;  a  photosensitive  surface  inside 
is  scanned  by  an  internal  cathode  ray  beam,  whereby  essentially  the  same 
electrical signal analogue of the exemplar space is obtained. This is the
iconoscope or image orthicon tube used in most television cameras.

The  two  following  sections  discuss  examples  of  these  kinds  of  scanners. 

Flying  Spot  Scanner.  Flying  spot  scanning  has  become  a  very  popular  tech¬ 
nique.  One  can  find  several  projects  using  such  scanners  with  very  little 
looking.  Among  others,  we  can  mention  the  British  Solartron  ERA  (8), 
Hannan’s RCA character reader (12), the work of Grimsdale’s group (11),
Rabinow’s  designs  (20),  Stone’s  experiments  at  the  Air  Force  Rome  Air 
Development  Center  (25),  the  work  of  Wada’s  group  in  Japan  (28),  and 
Week’s  project  at  IBM  (29). 


234  Man-Machine  Systems 

The  Solartron  ERA  deserves  special  mention,  for  it  was  one  of  the 
first  to  be  offered  for  commercial  service  and  it  makes  use  of  representative 
flying  spot  scanning  applications.  The  scanning  pattern  is  a  raster,  a  group 
of  straight  parallel  lines,  spaced  one  resolution  unit  apart.  The  cathode 
ray  ‘paints’  the  raster  on  the  tube  face.  Lenses  project  the  pattern  on  the 
exemplar  space.  The  pattern  has  16  lines  and  each  line  is  considered  to 
be  made  up  of  28  contiguous  cells.  The  lines  are  oriented  to  cross  the 
exemplar  space  vertically. 
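
The raster just described can be illustrated with a short Python sketch. It is not the ERA circuitry; it assumes the exemplar space has already been reduced to a grid of 0/1 values (1 = black), and the names `raster_scan` and `exemplar`, as well as the top-to-bottom and left-to-right sweep order, are assumptions made only for the illustration. The figures of 16 lines and 28 cells per line are those given in the text.

```python
LINES = 16   # vertical scan lines across the exemplar space (from the text)
CELLS = 28   # contiguous cells along each line (from the text)

def raster_scan(exemplar):
    """Return the serial pulse train for a CELLS x LINES grid of 0/1 values.

    exemplar[row][col] is 1 where the paper is black. Each scan line is one
    column, read top to bottom; lines are taken left to right. The sweep
    order is an assumption of this sketch, not a documented ERA detail.
    """
    pulses = []
    for col in range(LINES):
        for row in range(CELLS):
            pulses.append(exemplar[row][col])
    return pulses

# Hypothetical exemplar: a single vertical stroke lying under scan line 5.
exemplar = [[1 if col == 5 else 0 for col in range(LINES)]
            for _ in range(CELLS)]
train = raster_scan(exemplar)
```

The stroke appears in `train` as one unbroken run of 28 black cells, which is the kind of pulse train the photomultiplier would deliver for that line.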

The  raster  occupies  only  a  part  of  the  face  of  the  cathode  ray  tube. 
The  ERA  design  examines  successive  characters  of  a  line  by  moving  the 
center  of  the  raster;  thus  no  mechanical  motions  are  required  except  to 
advance  the  paper  through  the  scanner  line  by  line.  Since  the  documents 
for  which  the  first  machines  were  built  were  cash  register  tapes  this  was 
relatively  simple.  The  photomultiplier  cell  viewing  the  exemplar  space 
generates  a  train  of  pulses  representing  the  black  and  white  areas  of  the 
exemplar. 

This  process  can  proceed  at  enormous  speeds.  The  ERA  rate  of  250 
characters  per  second  is  by  no  means  the  upper  limit;  there  is  no  reason 
to  drive  the  scanner  at  rates  above  those  of  the  recognition  logic  and 
paper-handling  gear.  If  no  such  limits  existed  the  rate  might  be  raised  to 
50  million  characters  per  second.  Commercially  available  oscilloscopes  can 
achieve 10⁹ scans per second; at 20 lines per character the indicated rate
would  be  obtained  easily.  No  need  for  such  enormous  rates  is  visible  now, 
but  if  a  centralized  system  were  to  be  designed  for  a  number  of  users  in  a 
library,  for  instance,  higher  rates  than  are  needed  in  commercial  data- 
transcribing  readers  might  conceivably  be  useful. 
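
The arithmetic behind the 50 million figure can be checked directly. The sketch below uses only the numbers quoted in the paragraph above; the variable names are ours, and the "headroom" ratio is an added illustration rather than a figure from the source.

```python
SCANS_PER_SECOND = 10**9     # sweep rate quoted for commercial oscilloscopes
LINES_PER_CHARACTER = 20     # scan lines per character assumed in the text
ERA_RATE = 250               # characters per second quoted for the ERA

# Upper bound on characters per second if only the scanner set the pace.
chars_per_second = SCANS_PER_SECOND // LINES_PER_CHARACTER

# How far that bound lies above the ERA's working rate.
headroom = chars_per_second // ERA_RATE
```

On these figures the scanner bound is 50,000,000 characters per second, some 200,000 times the ERA rate, which is why the recognition logic and paper-handling gear, not the scanner, set the practical limit.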

The  ERA  was  designed  to  handle  digits  only,  and  a  few  special  char¬ 
acters,  in  a  fixed  font.  The  scanner,  however,  could  deliver  data  on  any 
of  the  three  types  of  characters.  The  scheme,  though  simple  in  concept, 
requires  several  electronic  circuits  to  make  it  operate.  It  is,  however,  in 
some  ways  less  complicated  than  a  television  receiver.  The  photocell  out¬ 
put  can  be  used  by  converters  or  buffers  as  required.  The  ERA  has  circuits 
to  reduce  the  effects  of  noise  (specks  of  dirt  and  holes  in  the  ink  pattern) 
and  also  uses  a  buffer  store  in  which  data  on  a  character  (generated 
sequentially) can be held until all has been obtained. Imperfections in
characters  cause  no  problems  for  the  scanner;  the  recognition  ‘logic’  has 
to  cope  with  those  difficulties. 

An  early  Rabinow  concept  for  a  machine  using  a  flying  spot  scanner 
involved  an  entirely  different  conception  of  recognition  logic.  The  light 
from  the  flying  spot  scan  of  the  exemplar  space  was  projected  onto  an 


Letter  Scanning  for  Character  Recognition  235 

aperture made in the shape of the character to be recognized. A disc brought
the  shapes  of  the  font  being  handled  into  position  one  after  the  other.  (For 
maximum  speed,  he  proposed  the  use  of  multiple  light  paths,  one  for  each 
of  the  characters  to  be  recognized.)  A  photocell  observing  the  aperture 
produced  data  from  which  the  recognition  circuits  could  function.  This 
scheme,  like  that  of  Harmon,  used  closely  integrated  scan  and  logic  com¬ 
ponents  to  advantage.  Rabinow  thought  1000  characters  per  second  could 
be  handled  in  this  way.  The  scheme  would  be  restricted  to  fixed  type  fonts 
and  would  be  vulnerable  to  imperfections  of  characters,  though  further 
system  complexities  could  be  introduced  to  offset  some  of  these  difficulties. 

Grimsdale’s  and  Wada’s  projects  both  have  interesting  recognition 
logic  schemes,  but  to  review  the  scanning  aspects  would  add  no  important 
new  developments  to  this  discussion. 

Hannan  of  RCA  has  just  recently  discussed  a  character  reader  for 
language  translation  purposes  which  uses  a  flying  spot  scanner.  It  has 
control  circuits  to  locate  lines  of  text  and  to  center  characters  both  horizon¬ 
tally  and  vertically  in  the  exemplar  space.  Thus  it  provides  the  protection 
against  mislocation  required  by  the  system’s  recognition  logic  process. 
Reading  rates  of  500  characters  per  second  are  claimed.  It  is  meant  to 
handle  fixed  but  changeable  type  fonts. 

Weeks,  while  at  IBM,  reported  a  study  of  character  recognition  which 
used a flying spot scanner. Here, instead of one raster scan on the exemplar
space, six were used. That is, the space was scanned 64 times along
parallel  lines  with  one  orientation;  this  was  then  repeated  five  more  times, 
each  time  with  the  lines  at  a  different  angle  from  the  base  line.  Only  6  of 
the  64  lines  were  actually  used  in  the  recognition  logic;  they  were  chosen 
from  the  64  by  the  control  circuitry  according  to  criteria  established  by 
the  designers.  The  significant  part  of  this  system  for  us  is  that  only  the 
speed  of  the  cathode  ray  tube  would  permit  so  extensive  an  investigation 
of  the  exemplar  space.  The  system  read  correctly  90  percent  of  a  sample 
of  100  hand-printed  numerals  (rather  carefully  produced);  it  read,  100 
percent  correctly,  a  sample  of  100  machine-printed  digits. 

At  the  Air  Force  Rome  Air  Development  Center,  Stone  built  a  proto¬ 
type  recognition  system  for  alphabetic  characters  in  1955  which  used  a 
flying  spot  scanner.  This  is  notable  because  the  scan  pattern  employed 
was  not  a  raster  but  was  ‘isotropic.’  This  scan  examined  every  cell, 
down  to  minimum  resolution  size,  over  every  point  with  mutually  or¬ 
thogonal  sweeps  in  both  directions.  Both  Stone  and  Kovasznay  (15,  25) 
used this system to obtain a clearer and ‘harder’ pattern from the original
in the exemplar space. Stone produced this improved pattern on the face

of  a  cathode  ray  tube  using  sweeps  synchronized  with  those  of  the  flying 
spot  tube.  From  that  point  in  the  design  photocells  and  logic  circuits  carried 
out  the  recognition  of  the  exemplar.  Stone’s  design  had  automatic  line 
finding  (up-down  registration)  and  character  finding  (left-right  registra¬ 
tion)  circuits.  The  paper  text  to  be  read  did  not  move;  the  scanner  pat¬ 
terns  moved  over  the  page  instead.  Characters  had  to  be  on  “reasonably 
regular”  base  lines  and  of  more  or  less  the  same  size.  The  results  reported 
are  not  specific  as  to  accuracy  or  limitations  on  distortion,  but  it  appears 
the  system  could  handle  manually  printed  block  capital  letters  and  also 
fixed  type  font  letters. 

Photosensitive  Cathode  Ray  Tube  Scanning.  The  systems  using  image 
orthicon  and  other  television  camera  cathode  ray  tubes  are  the  same, 
in  principle,  as  those  of  the  preceding  systems.  Here  the  flying  spot 
scanning  tube  and  the  photocells  are  in  effect  combined  within  a 
single  cathode  ray  tube  envelope.  The  tube  cost  tends  to  be  higher, 
but the technique has nonetheless attracted some workers. W. K. Taylor
used  it,  in  England  (26),  and  a  team  at  Control  Instruments,  a  divi¬ 
sion  of  Burroughs  Corporation,  used  it  for  a  U.  S.  Army  Signal  Corps 
type  page  reader  project  (27).  The  Control  Instruments  project  used  a 
conventional  raster  scan  of  six  lines  to  examine  the  image  of  the  exemplar 
space,  which  was  optically  placed  on  the  sensitive  surface  of  an  image 
orthicon.  Signals  due  to  noise  specks  were  filtered  out  electrically.  A  series- 
to-parallel  conversion  in  tapped  delay  lines  allowed  the  sequentially  derived 
data  to  be  applied  simultaneously  to  the  recognition  logic  circuits.  Servo¬ 
positioning  of  the  paper,  plus  rescanning  of  the  exemplar  space,  were  used 
to  offset  mislocations,  but  the  system  was  restricted  to  a  fixed  type  font 
(although  capitals,  lower  case  letters,  numerals,  and  punctuation  marks 
were  all  included)  and  to  angular  misorientation  less  than  ten  degrees. 
Its  recognition  logic  has  some  resistance  to  imperfection  in  the  exemplars. 
It  will  handle  75  characters  per  second  (of  double  spaced  elite  type). 
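
The series-to-parallel conversion mentioned above can be pictured as a shift register. The following Python sketch treats a tapped delay line as an ideal chain of unit delays, which is an assumption; the class name `TappedDelayLine` and the figure of 8 cells per line are hypothetical, since the source gives only the six-line raster.

```python
from collections import deque

class TappedDelayLine:
    """Idealised delay line: one tap per sample period (an assumption)."""

    def __init__(self, taps):
        # Every tap starts at 0 (white).
        self.cells = deque([0] * taps, maxlen=taps)

    def shift_in(self, sample):
        # A new sample enters; the oldest one falls off the far end.
        self.cells.append(sample)

    def read_taps(self):
        # All taps are observed at the same instant, oldest sample first.
        return list(self.cells)

LINES = 6            # raster lines, from the text
CELLS_PER_LINE = 8   # hypothetical; the source does not state this figure

def serial_to_parallel(pulse_train):
    """Feed a whole character's pulse train in, then read every tap at once."""
    line = TappedDelayLine(LINES * CELLS_PER_LINE)
    for sample in pulse_train:
        line.shift_in(sample)
    return line.read_taps()

# Hypothetical pulse train: the first cell of every scan line is black.
pulse_train = [1 if i % CELLS_PER_LINE == 0 else 0
               for i in range(LINES * CELLS_PER_LINE)]
parallel = serial_to_parallel(pulse_train)
```

Once the last sample has entered, every cell of the character is available simultaneously at the taps, so the recognition logic can examine the whole pattern in one step.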

One  of  W.  K.  Taylor’s  projects  used  a  television  camera  tube  as  an 
input  scanner.  A  serial-to-parallel  conversion  is  obtained  by  gating  the 
tube  signals  into  storage  capacitors.  The  recognition  logic  is  analogue 
rather  than  digital  and  is  much  less  elaborate  than  that  of  many  other 
systems.  It  is  well  worth  study. 

Variations  on  the  Sequential  Scanning  Scheme 

Combinations of electrical and mechanical scanning techniques are possible,
and many variations on the schemes outlined above could be found.

One  worthy  of  note  was  part  of  Zworykin’s  work  on  reading  machines  for 
the  blind.  He  used  a  flying  spot  scanner  to  create  eight  bright  spots  in  a 
row,  the  row  being  oriented  perpendicularly  to  the  base  line  of  the  charac¬ 
ters  being  read.  A  photocell  observed  the  exemplar  space.  As  the  bright 
spots  were  moved  from  left  to  right  by  moving  the  cathode  ray  tube  mount 
mechanically,  signals  were  generated  which  could  be  used  to  recognize 
the characters being scanned. This simple device was claimed to be entirely
adequate for use to control recordings which pronounced the letters
as  they  were  scanned.  Forty  words  per  minute  was  the  highest  comfortable 
speed,  but  the  scanner  and  its  logic  could  manage  200  words  per  minute 
if  there  were  any  way  for  the  output  to  keep  up  with  this  rate  (30). 

As  some  of  the  designs  mentioned  have  shown,  it  is  possible  to  com¬ 
bine  parallel  operation  with  the  scanning  scheme.  That  is,  one  can  use 
three  flying  apertures  and  three  photocells  to  scan  the  exemplar  space 
simultaneously,  as  did  one  Farrington  machine  design  (1).  National  Cash 
Register  Co.  has  announced  a  reader  for  cash  register  tapes  which  uses 
four  photocells  and  a  drum  with  apertures  to  scan  ten  zones  of  the  exemplar 
space  (17). 

Another  RCA  project  not  yet  mentioned  has  also  been  discussed  in 
terms  of  commercial  data  handling  applications.  It  uses  a  vertical  line  of 
seven  photocells  in  a  carriage  which  moves  along  a  line  of  characters  on 
a  page  mounted  on  a  rotating  drum.  This  machine  is  able  to  read  at  the 
rate  of  1000  characters  per  second.  Its  paper  handling  equipment  pro¬ 
cesses  12  pages  per  minute  of  20  lines,  with  100  characters  per  line.  A 
special 51-character alphanumeric type font is used (18).
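
The two rates quoted for this machine can be compared with a little arithmetic. The sketch below takes both published figures at face value; the variable names and the derived ratio are ours and do not appear in the source.

```python
PAGES_PER_MINUTE = 12    # paper-handling rate, from the text
LINES_PER_PAGE = 20      # from the text
CHARS_PER_LINE = 100     # from the text
READ_RATE = 1000         # characters per second while reading, from the text

# Characters delivered by the paper handler each minute and each second.
chars_per_minute = PAGES_PER_MINUTE * LINES_PER_PAGE * CHARS_PER_LINE
paper_rate = chars_per_minute / 60

# Fraction of the time the reader would be busy at its quoted rate.
duty = paper_rate / READ_RATE
```

On these figures the paper handler supplies 24,000 characters per minute, or 400 per second, so the sustained throughput would appear to be set by paper handling rather than by the reading rate.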

PARALLEL  SENSING  OF  THE  EXEMPLAR  SPACE 

In  the  schemes  discussed  above,  the  elements  of  the  exemplar  space  were 
examined  by  scanning  them  sequentially.  Here,  we  shall  consider  schemes 
in  which  the  elements  are  sensed  simultaneously.  One  group  of  projects 
uses  parallel  sensing  by  providing  one  photocell  per  element  in  the  ex¬ 
emplar  space.  Another  group  treats  the  entire  exemplar  space  as  an  entity; 
it  seems  proper  to  refer  to  these  as  Gestalt  schemes. 

Photocell Arrays

The  idea  of  an  array  of  photocells  on  which  the  exemplar  space  is  placed 
optically  has  been  central  to  many  pattern  recognition  projects.  This  is 
especially  true  of  research  projects  where  no  actual  scanning  is  done,  but 
the  entire  system  design  and  operation  is  simulated  on  a  digital  computer. 
This is usually justified (and correctly so) by saying that whatever scanner

proves practical can produce data in the same form as if an array of
photocells  were  being  used.  The  drawback  of  this  point  of  view  is  that  it 
tends  to  lead  to  separation  of  scanner  and  recognition  processes,  and  as 
we  have  seen  and  will  see,  this  dichotomy  hides  from  view  a  number  of 
very  interesting  approaches. 

Taylor,  whose  television  camera  scanning  scheme  was  discussed  above, 
has  also  developed  designs  based  on  a  9  by  9  array  of  photocells  (26).  A 
demonstration  model  which  was  displayed  at  a  conference  in  England  in 
1958  occupied  less  than  five  cubic  feet  and  could  handle  typewritten  char¬ 
acters.  Taylor  felt  that  the  equipment  could  recognize  imperfect  type  and 
handwritten  characters,  although  no  performance  data  are  available  (these 
may appear in a forthcoming monograph).

The  Perceptrons  of  Rosenblatt  and  other  investigators  afford  per¬ 
haps  the  biggest  group  of  photocell  array  sensing  schemes.  Since  the 
array  is  not  important  to  these  investigators  (their  interest  is  primarily  in 
the  “learning”  processes  whereby  the  Perceptrons  acquire  recognition  and 
logic  capabilities),  these  designs  do  not  involve  any  new  points  of  in¬ 
terest  so  far  as  the  sensing  process  is  concerned.  These  projects  have  defi¬ 
nitely  shown  experimentally,  however,  that  arrayed  cells  can  accomplish 
sensing in recognition systems (5, 6, 16, 23, 24).

One  unusual  sensing  scheme  developed  in  connection  with  Perceptron 
work,  however,  has  been  discussed  by  Cameron  of  the  Armour  Research 
Foundation  (3).  A  light-sensitive  polychromic  film  is  used  as  the  adaptive 
filter  which,  interposed  between  photocells  and  exemplar  space,  affords 
recognition  in  Perceptron  form  (accounts  of  this  scheme  so  far  published 
have been news releases only).

Rabinow,  in  one  project,  has  an  array  of  photocells  in  use  for  sensing 
purposes  (21). 

Some  photocell  array  schemes  do  not  use  equally  spaced  rectangular 
arrangements of photocells. Dimond has demonstrated how hand-made
block  numerals  can  be  recognized  with  a  configuration  of  photocells  if  the 
numbers  are  written  properly  with  respect  to  two  reference  dots.  He 
has  also  discussed  the  extension  of  the  idea  to  the  handling  of  alphabetic 
characters.  The  startling  thing  about  these  designs  is  their  simplicity; 
one  version  needed  just  seven  relays  for  all  scanning,  recognition  logic, 
and  output  functions  (7). 

Stone,  whose  work  was  discussed  above,  used  photocells  not  for  the 
original  scanning  of  the  exemplar  space  but,  operating  from  an  improved 
image generated by the scanner, as recognition logic elements. Though

not  part  of  our  present  discussion,  we  have  here  an  illustration  of  the  use 
of  a  nonregular  arrangement  of  photocells  to  sense  an  optical  pattern  (25). 

Kovasznay and Joseph have also described, and perhaps even experimented
with, a set of photocells for sensing purposes (15).

Gestalt  Schemes 

Two  investigations  of  Gestalt  schemes  have  been  reported:  one  by 
Fitzmaurice, Sabbagh and Elliott of Baird-Atomic Co. (10); and one by
L.  R.  Brown  of  Briggs  Associates  (4),  now  a  part  of  Drexel  Dynamics 
Co.  For  sheer  imaginative  ingenuity  these  two  projects  are  outstanding.  Only 
Taylor  has  reported  results  comparable  to  those  which  appear  to  be  pos¬ 
sible  with  the  approach  so  far  as  simplicity  of  equipment  and  sheer  effec¬ 
tiveness  are  concerned  (26). 

The  two  projects,  though  similar  in  concept,  differ  in  detail.  The  Baird- 
Atomic  project  has  not  been  discussed  in  the  generally  available  literature; 
Brown’s project has. We will look first at the latter and note the similarities
and  differences  with  the  scheme  of  Fitzmaurice,  et  al. 

Imagine n² lenses (in a system to recognize n² characters) arranged
in  an  n  by  n  array.  Let  parallel  rays  of  light  fall  on  the  array.  There  is  a 
mask  in  front  of  each  lens.  Each  mask  has  a  hole  in  it  at  one  position, 
which  is  the  same  position,  relative  to  its  associated  lens,  for  each  mask. 
If  the  lenses  are  spherical  lenses  in  a  lenticular  array  (as  in  Brown’s  de¬ 
sign),  every  lens  will  pass  the  light  coming  through  its  mask  to  one  com¬ 
mon  point  in  a  viewing  plane.  If  another  set  of  holes  is  opened  in  the  masks 
the  lenses  will  pass  light  through  the  new  holes  to  one  new  common  point 
in the viewing plane. There are positions for n² holes in the mask for each
of the n² lenses. One mask position is assigned to each of the n² characters.
Parallel  light  rays  coming  from  exemplar  space  fall  on  the  masks.  Suppose 
the  character  to  be  dark  in  a  light  field.  Some  of  the  lens-mask  pairs  will 
be  in  darkness;  some  will  be  illuminated.  Those  which  are  dark  have  open 
holes  at  the  position  assigned  to  the  character.  Consequently,  essentially 
no  light  falls  on  the  photocell  to  which  all  the  lenses  send  the  light  coming 
through  these  holes.  At  other  positions,  where  light  falls  on  the  masks,  open 
holes  pass  light  to  the  other  cells.  Minimum  photocell  output  identifies 
the  character  in  the  exemplar  space. 

Actually, Brown’s design uses 2n² photocells and 2n² holes in each of
the n² masks. The extra set of holes and cells provides for the logical
inverse  condition.  Holes  are  opened  in  the  masks  where  the  character  is  not. 
Differential output from the two photocells associated with a single character
permits the system to reject characters such as an E with a gap in
its  bottom  line,  which  might  be  an  F;  or  an  F  with  a  smudge,  which  might 
be  an  E.  The  thresholds  for  recognition  and  rejection  are  adjustable. 
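
The logic of the mask scheme, as distinct from its optics, can be sketched in Python. The sketch assumes each lens samples one cell of the exemplar space, so a photocell's output is simply the count of open holes that receive light. The 5 by 3 test patterns, the function names, the normalisation of the differential output, and the single acceptance threshold of 0.9 are all assumptions made for illustration; the source says only that a differential output is used and that the thresholds are adjustable.

```python
# Hypothetical 5 x 3 test patterns; "#" marks a dark cell.
GLYPHS = {
    "E": ["###", "#..", "###", "#..", "###"],
    "F": ["###", "#..", "###", "#..", "#.."],
}

def dark_cells(rows):
    return {(r, c) for r, line in enumerate(rows)
            for c, ch in enumerate(line) if ch == "#"}

def all_cells(rows):
    return {(r, c) for r, line in enumerate(rows) for c in range(len(line))}

def recognize(exemplar_rows, accept=0.9):
    """Return (character or None, scores) for one exemplar."""
    lit = all_cells(exemplar_rows) - dark_cells(exemplar_rows)
    scores = {}
    for name, rows in GLYPHS.items():
        holes = dark_cells(rows)                  # open where the character is
        inverse_holes = all_cells(rows) - holes   # open where it is not
        direct = len(lit & holes)            # light leaking to the direct cell
        inverse = len(lit & inverse_holes)   # light reaching the inverse cell
        # Normalised differential output (the normalisation is our assumption).
        scores[name] = (inverse - direct) / len(inverse_holes)
    best = max(scores, key=scores.get)
    return (best if scores[best] >= accept else None), scores

# An E with a gap in its bottom line, as in the example in the text.
broken_e = ["###", "#..", "###", "#..", "#.#"]
result, scores = recognize(broken_e)
```

With these patterns a clean E or F scores 1.0 for its own mask and is accepted, while the broken E scores below the threshold for both masks and is rejected rather than misread as an F.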

This  scheme,  it  is  felt,  can  handle  type  fonts  of  some  diversity  and 
not  just  a  single  font.  Variations  in  contrast  cause  no  difficulty  and  the 
system  is  resistant  to  smudged  and  incomplete  characters.  Misregistration 
in  the  left-right  direction  causes  no  problem:  the  photocell  output  will 
peak at some point as the character passes left-to-right through the exemplar
space. Up-down misregistration can be compensated for by mechanical
wobbling of the mask-and-lens assembly.

One  of  the  attractive  aspects  of  this  scheme  is  that  creation  of  the 
masks is a simple, if tedious, exercise. A piece of film is placed where the n²
masks  will  be  mounted.  A  light  bulb  is  placed  where  the  photocell  for  a 
character  will  be.  A  transparency  (dark  numeral,  light  field)  of  the  char¬ 
acter  is  put  into  position  in  register  with  the  lenticular  array.  The  bulb 
is  turned  on,  exposing  the  film  through  the  lenticular  lens  array.  Thus  a 
dark  spot  is  exposed  on  the  film  where  the  masks  require  no  holes,  and 
conversely.  This  is  repeated  for  all  characters  (and  for  the  logical  inver¬ 
sions).  The  film,  of  course,  becomes  the  entire  set  of  masks.  Since  the 
mask  is  made  in  the  optical  system  with  which  it  will  be  used  no  high- 
precision  operations  are  required. 

The Baird-Atomic project, though similar, is even more imaginative.
To  recognize  n  characters,  n  +  1  photocells  suffice,  and  no  lenses  are 
needed.  The  design  derives  from  the  observation  that  the  mathematics 
of  a  two-dimensional  optical  system  of  a  defocused  lens  with  an  aperture 
stop  in  the  lens  plane  is  similar  to  that  of  a  one-dimensional  electric  filter 
circuit. N apertures are made by photographing the characters to be recognized.
The (n + 1)th aperture is the complement of all the characters; it
is made by printing all the characters at one position, one after another,
on top of one another, without moving the paper. N + 1 photocells and
an amplitude comparison system effect recognition.

CONTOUR  TRACING 

Preceding  sections  have  dealt  with  sensing  systems  based  on  sequential 
scanning  of  the  exemplar  space  and  on  parallel  path  sensing  of  the  space. 
In  all  of  these  cases  the  character  pattern  and  its  surround  or  background 
were  of  nearly  equal  importance.  Here  we  will  consider  some  schemes 
which  largely  ignore  the  surround;  only  the  trace  of  the  character  itself 
is  of  primary  significance. 

Kovasznay and Joseph included an investigation of contour tracing

systems  in  their  work  (15).  They  used  a  flying  spot  cathode  ray  tube 
scanner  which  had  control  circuitry  for  locating  objects  in  the  field  and 
other  circuitry  for  driving  the  spot  around  the  perimeter  of  a  pattern  once 
located.  They  had  a  process  for  recognition  which  operated  from  the 
scanning  voltages’  amplitude-versus-time  relationship. 

In  Europe  investigations  of  contour  scanning  have  apparently  been 
conducted,  but  the  relevant  literature  is  difficult  to  obtain.  At  the  Kiev 
Computing  Center  in  the  U.  S.  S.  R.,  there  has  been  an  investigation  of 
a  contour  tracing  system.  The  following  account  comes  from  a  report  writ¬ 
ten  by  an  American  visitor  to  the  Center: 

“The  essential  principle  is  elegantly  simple.  A  cathode  ray  beam  provides  a 
light  spot  which  is  focussed  through  a  lens  onto  the  scanned  letter.  The  light 
reflected  from  the  letter  is  collected  by  a  photomultiplier  tube.  The  collected 
signal  gives  light-dark  information.  The  beam  moves  with  unit  steps  in  small 
squares,  clockwise  in  a  white  field  and  counterclockwise  in  a  black  field  .  .  . 

“To  prevent  cycling  of  the  beam  in  an  all-white  or  all-black  area,  the  tracking 
mechanism  adjusts  itself  to  move  two  units  at  a  time  when  two  successive  unit 
moves  have  produced  no  change  from  white  to  black  or  black  to  white.  This 
simple  mechanism  will  result  in  a  rough  tracking  of  the  edges  of  a  letter. 

“At  each  point  that  the  beam  changes  direction,  a  new  average  direction  is 
computed  and  coded.  In  this  coding,  only  eight  directions  are  recognized:  0°, 
45°,  90°,  135°,  180°,  225°,  270°,  and  315°.  A  recognition  system  based  on  this 
coding  is,  therefore,  insensitive  to  slight  rotational  changes  in  the  letters.  Large 
rotations,  however,  cannot  be  tolerated.  It  was  mentioned  that  this  lack  of 
discrimination  is  not  a  source  of  concern  because  ‘rotation  of  letters  is  never 
an  allowable  operation  in  ordinary  printing  and  reading’  ”  (9). 

The  work  had  not  yet  progressed  to  design  of  the  recognition  process,  but 
the  contour  tracing  apparently  was  functioning  and  producing  data  from 
real  characters  which  were  being  studied  in  the  design  of  the  recognition 
logic. 
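
The stepping rule quoted above (clockwise in white, counterclockwise in black) closely resembles what is now usually called square tracing, and it can be sketched in Python. This is an illustration under assumptions, not the Kiev design: the image is taken as a grid of 0/1 values, the two-unit step used to escape uniform areas is omitted, and the direction is coded between successive black cells rather than averaged at each change of heading. The names `trace_contour` and `direction_code` are hypothetical.

```python
import math

def trace_contour(image, start, max_steps=10_000):
    """Walk the edge of a black region: turn left on black, right on white."""
    rows, cols = len(image), len(image[0])

    def is_black(r, c):
        return 0 <= r < rows and 0 <= c < cols and image[r][c] == 1

    r, c = start
    dr, dc = -1, 0                 # initial heading: up (an arbitrary choice)
    visited = []
    for _ in range(max_steps):
        if is_black(r, c):
            if not visited or visited[-1] != (r, c):
                visited.append((r, c))
            dr, dc = -dc, dr       # black: turn left (counterclockwise)
        else:
            dr, dc = dc, -dr       # white: turn right (clockwise)
        r, c = r + dr, c + dc
        if (r, c) == start and (dr, dc) == (-1, 0):
            break                  # back at the start with the same heading
    return visited

def direction_code(dr, dc):
    """Quantise a displacement to the nearest of eight directions, in degrees."""
    angle = math.degrees(math.atan2(-dr, dc)) % 360   # rows grow downward
    return (round(angle / 45) % 8) * 45

# Hypothetical test pattern: a 2 x 2 black square in a white field.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
path = trace_contour(image, start=(2, 1))
codes = [direction_code(r2 - r1, c2 - c1)
         for (r1, c1), (r2, c2) in zip(path, path[1:])]
```

Because every displacement is rounded to the nearest multiple of 45 degrees, a stroke tilted by a few degrees yields the same code sequence, which is the insensitivity to slight rotation noted in the quoted report.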

Brouillette  and  Johnson  of  the  General  Electric  Co.  have  also  worked 
on  a  contour  tracing  scheme  which  uses  a  flying  spot  scanner  to  put  a  small 
circle  of  light  down  on  the  line  being  followed.*  A  pulse  occurs  in 
the  phototube  output  every  time  the  flying  spot  crosses  the  line  while 
making  its  circle.  (Negative  images  of  the  pattern  were  used:  transparent 
outlines  on  opaque  background.)  The  times  at  which  the  pulses  occur  indi¬ 
cate  the  distance  and  direction  from  the  center  of  the  circle  to  the  line. 
Closed  loop  control  circuits  keep  the  circle  on  the  line. 

This  type  of  sensing  has  been  proven  workable,  but  its  successful  use 


*  Reported  on  by  the  staff  of  Automatic  Control,  August  1960,  pp.  16-17. 


242  Man-Machine  Systems 

for  character  recognition  will  have  to  wait  for  the  design  of  effective  rec¬ 
ognition  logic.  One  might  venture  the  opinion  that  this  will  be  a  long  wait. 
In  any  case,  those  working  on  these  three  or  similar  projects  need  not  feel 
hurt  by  this  opinion.  The  approach  has  been  shown  to  work,  and  these 
projects  were  directed  primarily  toward  patterns  other  than  characters. 
The  workers  in  Kiev  may  also  draw  comfort  from  the  fact  that  whenever 
I  go  on  record  with  such  opinions,  it  has  usually  turned  out  that  I  was 
wrong! 


CONCLUSIONS 

In summing up our feelings about the sensing process and the alternative
schemes  for  realizing  the  process  in  working  equipment,  there  is  one 
main  aspect  of  that  process  which  stands  out  as  critically  important.  Above 
all  else  it  is  the  over-all  character  reading  system  design  which  determines 
the  sensing  scheme  to  be  used.  To  consider  the  sensing  operation  separate 
from  the  recognition  logic  operation  may  well  be  useful  in  theoretical  studies 
of  the  latter.  But  in  working  equipment  the  laurels  for  simplicity  and  low 
cost  go  to  those  who  view  the  system  as  an  integrated  whole.  This  may 
seem  obvious  to  mention,  but  note  that  quite  a  lot  of  effort  has  been  spent 
on  studies  which  ignored  this  point;  and  that  a  substantial  quantity  of  com¬ 
mercial  equipment  has  been  built  which  makes  little  use  of  the  potentiali¬ 
ties  of  an  awareness  of  this  interdependence. 

The  character  reading  system  design  is,  in  turn,  most  heavily  con¬ 
ditioned  by  whether  the  system  is  to  read  fixed  type  font  or  handwritten 
characters.  If  printed  characters  made  by  fixed  type  fonts  are  to  be  read, 
even  if  more  than  one  type  font  may  be  encountered,  Fitzmaurice’s  design 
is  the  standard  for  comparison.  In  spite  of  the  brilliance,  ingenuity,  and 
hard  work  that  have  been  put  into  other  schemes,  this  method  has  such 
compelling  conceptual  advantages  that  it  is  the  first  choice.  Brown’s  de¬ 
sign  is  a  relatively  close  competitor,  and  there  is  a  possibility  that  the  ap¬ 
proach  taken  by  Taylor  may  be  worth  considering. 

If  hand-printed  block  characters  or  handwritten  cursive  characters 
are  to  be  read,  however,  the  flying  spot  cathode  ray  tube  scanner  becomes 
the  standard  for  comparison.  The  flying  aperture  schemes  have  the  ad¬ 
vantages  noted  with  respect  to  the  matter  of  financing.  This  is  not,  in  the 
last  weighing,  a  sufficient  advantage.  Flying  spot  cathode  ray  tube  scanners 
can  be  made  intriguing,  too.  And  while  in  some  individual  cases  we  may 
find designers who favor mechanical techniques for personal reasons (facility,
availability of tools) we must note that for investigators equally

skilled  in  both  techniques  the  cathode  ray  tube  approach  is  preferable. 
Necessary  parts  are  available  from  stock  at  very  modest  prices;  and  the 
associated  circuitry  that  would  be  needed  can  be  drawn  directly  from 
cathode  ray  tube  experience.  A  laboratory  oscilloscope  provides  most  of 
what  would  be  needed  for  preliminary  experiments. 

Photocell  array  schemes  are  probably  closer  competitors  of  flying  spot 
tube  scanners.  They  have  attractive  advantages  of  simplicity  of  concept 
and  of  circuitry.  But  in  the  end  the  determining  weight  is  on  the  side 
of  the  flexibility  of  the  cathode  ray  tube.  With  it  we  can  alter  scans  both 
in  pattern  and  in  orientation,  search  for  exemplars  off  base  line,  scan  letters 
in  lines  and  even  lines  in  text  without  moving  paper,  and  so  on.  A  prefer¬ 
ence  for  flying  spot  tube  and  photocell  over  image  orthicon  or  other  camera 
tubes  is  due  primarily  to  the  relative  costs  and  simplicity  of  the  two. 

We  can  conclude  with  the  observation  that  the  sensing  operation  will 
not  set  limitations  on  the  system  designer,  regardless  of  his  line  of  ap¬ 
proach.  Whether  for  speed,  resolution,  equipment  bulk,  cost,  reliability, 
suitability  for  use  with  recognition  equipment,  or  whatever,  the  designer’s 
difficulties  need  not  lie  in  the  sensing  operation;  in  fact,  he  can  elaborate 
the  sensing  function  to  simplify  other  problems. 

REFERENCES 

1. “Alphanumeric Reader Due in Commercial Versions,” Elec. Design, Vol. 8
   (11 May 1960), pp. 26-29.

2. Bauldreay, J., and E. Milbradt, “Solving Registration Problems in Optical
   Character Recognition,” Electronics, Vol. 35 (5 January 1962), pp. 77-82.

3. “Brain Models and Neural Nets,” Electronics, Vol. 35 (2 March 1962), p. 42.

4. Brown, L. R., “Nonscanning Character Reader Uses Coded Wafer,” Electronics,
   Vol. 33 (25 November 1960), pp. 115-117.

5.  Bushor,  W.  E.,  “The  Perceptron — Experiment  in  Learning,”  Electronics,  Vol.  33 

(22  July  1960),  pp.  56-59. 

6. Clark, W., and B. Farley, “Generalization of Pattern Recognition in a
   Self-Organizing System,” in Proc. Western Joint Computer Conference
   (AIEE-IRE-ACM), 1955, pp. 86-91.

7. Dimond, T. L., “Devices for Reading Hand-Written Characters,” in Proc.
   Eastern Joint Computer Conference (AIEE-IRE-ACM), 1957, pp. 232-237.

8. ERA. U.K.: Solartron Electronic Group, Ltd., (undated), 14 pp.

9. Feigenbaum, S., “Report on a Visit to the USSR,” Communications of the ACM,
   December 1961, p. 573.

10. Fitzmaurice, J. A., E. Sabbagh, and W. Elliott in Fischer, Pollack, Radack,
    and Stevens (eds.) Optical Character Recognition. Washington, D. C.:
    Spartan Books, 1962.

11. Grimsdale, R. L., F. H. Sumner, C. J. Tunis, and T. Kilburn, “A System for
    the Automatic Recognition of Patterns,” Proc. I.E.E., Vol. 106B, No. 26
    (1959), p. 210.

12. Hannan, W. T., “Character Reading Machine for Language Translation
    Project,” Elec. Design, Vol. 10 (1 February 1962), p. 12.

13. Harmon, L., “A Line-Drawing Pattern Recognizer,” Electronics, Vol. 33
    (2 September 1960), pp. 39-43.

14. Kirsch, R. A., L. Cahn, L. C. Ray, and G. H. Urban, “Experiments in
    Processing Pictorial Information with a Digital Computer,” in Proc.
    Eastern Joint Computer Conference (IRE-AIEE-ACM), 1957.

15. Kovasznay, L., and H. M. Joseph, “Image Processing,” Proc. I.R.E.,
    (May 1955), pp. 560-570.

16. Mattson, R. L., “Self-Organizing Systems,” in Proc. Eastern Joint Computer
    Conference (AIEE-IRE-ACM), 1959, pp. 212-217.

17. “Optical Reader Fast Accurate,” Elec. Design, Vol. 10 (1 February 1962),
    p. 13.

18. “Optical Reader Uses 35-Point Photocell Matrix,” Elec. Design, Vol. 9
    (18 January 1961), pp. 8-9.

19. “Photo Analyzer Developed,” Elec. News, (26 March 1962), p. 98.

20. Rabinow, J. Experimental Program Testing Techniques for a Working
    Character Reading Machine. ASTIA Reference AD-128214.

21. Rabinow, J., “Developments in Character Recognition Machines at Rabinow
    Engineering Company,” in Fischer, Pollack, Radack, and Stevens (eds.)
    Optical Character Recognition. Washington, D. C.: Spartan Books, 1962.

22. “Reader Will Scan 10,000 Envelopes per Hour,” Elec. Design, Vol. 8
    (12 October 1960), pp. 14-19.

23. Roberts, L., “Pattern Recognition with an Adaptive Network,” IRE National
    Convention Record, Part 2, March 1960, pp. 66-70.

24. Rosenblatt, F. Principles of Neurodynamics. Washington, D. C.: Spartan
    Books, 1962.

25. Stone, W. P. Test of Techniques for Working Character Reading Machine.
    Rome Air Development Center (USAF) Technical Note 56-194, June 1956,
    13 pp. (ASTIA AD-103237).

26. Taylor, W. K., “Automatic Control by Visual Signals,” in Proc. Symposium on
    Mechanisation of Thought Processes. U.K.: National Physical Laboratories,
    1958, pp. 841-861 (also note opposite pp. 951-952).

27.  Typed-Page  Reader.  Control  Instrument  Co.,  Burroughs  Corp.,  Reports  to  U.  S. 

Army:  ASTIA  AD  115097,  AD  219075,  AD  207514,  AD  209011,  1957-1958. 

28.  Wada,  H.,  “Character  Reading  Machine,”  in  Proc.  International  Conference  on 

Information  Processing.  Paris:  UNESCO,  June  1959. 

29.  Weeks,  R.  W.,  “Rotating  Raster  Character  Recognition  System,”  Communica¬ 

tions  and  Electronics  (AIEE),  September  1961,  pp.  353-359. 

30.  Zworykin,  V.,  L.  Flory,  and  W.  Pike,  “Letter-Reading  Machine,”  Electronics, 

Vol.  22  (June  1949),  pp.  80-86. 


SOME OF THE LOGIC PRESENTLY USED IN
CHARACTER RECOGNITION MACHINES*

JACOB  RABINOW 

Rabinow  Engineering  Company,  Inc.,  Rockville,  Maryland 


I was asked to speak in this panel about the logic presently used in
character recognition machines. I shall try to cover the field, including not
only machines built by Rabinow Engineering Company but those of our honored
competitors as well.

Before I begin, I would like to make some general comments about
some of the papers I have heard. There was enough philosophizing and
generalizing in them to last me a lifetime. I particularly objected to
the gratuitous insult of saying that people who build reading machines
should devote time to complete system analysis and to an over-all look at
what they are doing. I can assure you that people who spend so much time
designing actual hardware do look at over-all systems, do consider the
over-all point of view, the purposes, the philosophy, and the ethics of their
machines; and that they need not be told that there is an over-all point
of view apart from their hardware. What exactly do we do with our lunch
time and our evenings?

If  but  one  of  these  authors  had  taken  the  trouble  to  look  into  this  mat¬ 
ter  before  he  spoke,  he  would  have  found  that  many  of  his  suggestions  and 
criticisms  have  been  answered,  for  some  of  them  are  now  ancient  history. 

In  order  to  describe  machines  which  may  be  particularly  useful  to  the 
blind,  I’d  like  to  describe  machines  in  general  and  then  make  some  remarks 
specifically  about  machines  for  the  blind. 

The  first  type  of  machine  I  would  like  to  describe  is  the  correlation 
machine,  such  as  is  being  built  by  my  company,  by  Philco,  by  IBM,  and 
others.  It  should  be  understood,  of  course,  that  these  machines  are  not  all 
of  the  same  type — that  they  vary  in  detail.  In  some  cases,  the  name  “cor¬ 
relation  machine”  may  be  a  misnomer.  A  particular  machine  may  use  one 
or  more  features  of  other  types  as  well. 

*  This  paper  has  been  reconstructed  by  Mr.  Rabinow  from  the  recording  of  his 
extempore  presentation — Ed. 



Suppose  one  assumes  that  a  character  on  a  piece  of  paper  is  scanned 
by  some  suitable  means,  such  as  a  flying  spot  cathode  ray  scanner  that 
illuminates  one  point  of  the  paper  at  a  time,  or  by  a  vidicon  tube  as  used 
in  television,  or  a  Nipkow  disk  (which  is  an  ancient  form  of  the  same 
sort  of  thing:  it  examines  one  point  at  a  time,  line  by  line),  or  by  a  full  retina 
of  photocells  which  looks  at  a  character  area  all  at  one  time,  in  parallel,  as 
does  the  retina  of  the  human  eye.  The  method  of  scanning  is  not  important. 
Each  elemental  area  of  the  paper  on  which  the  character  is  found  is 
transmitted  into  something  we  shall  call  the  “shift  register.”  This  is  a  set 
of  electronic  pigeon  holes,  where  each  box  represents  a  small  area  on  the 
paper. The diameter of this area is usually of the order of magnitude of
the thickness of a character line, that is, the thickness of the black line of
which a character is composed.

A typical correlation machine works as follows. Each electronic pigeon
hole has two output terminals, one of which we shall call the “assertion
terminal,” and the other the “negation terminal.” Let us assume that if the
elemental area corresponding to a particular pigeon hole is black, the
assertion terminal of this particular register element will show a potential
of +6 volts, and the negation terminal in this case will show -6 volts
(relative to ground). If this particular element represents a white spot on
the paper, the assertion terminal will show -6 volts, and the negation
terminal will show +6 volts. Thus, the two voltages are the inverse of each
other, depending on whether the element being examined is black or white. In
actual practice one can have a continuous range of the gray scale, represented
by a continuous scale of voltages; or one can quantize the black into several
levels so that there are discrete voltages, but in several steps. For the
present discussion the method used is immaterial.

Assume, now, that a character is being examined and the character is,
say, the numeral “1.” The middle vertical row of the register would then
consist of register elements whose assertions will be +6 volts, and whose
negations will be -6 volts. On both sides of this row of “black” cells will be
“white” cells whose assertions will be -6 volts and whose negations will
be +6 volts. If I now want to examine whether this register is storing the
figure “1,” I would connect wires to the assertions of the middle row cells
(that is, to all those points I expect to be +6 volts) and to the negation
terminals of all those cells which I expect to be white (that is, to the
negation terminals which also should be +6 volts). If each wire goes to
one resistor, I will have a group of resistors, say 20 to 50, all of which
should be connected to +6 volts, if the figure “1” has been, or is being,
stored in this register. If I now tie all the other (free) ends of the resistors
together, I should have a wire or bus which should be at +6 volts potential.
In practice, we never get a perfect correlation between the unknown
character and the one we expect to find in the register, and so the voltage is
never exactly +6. It may be something close to it, as I’ll describe later.
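
The resistor summing just described can be sketched numerically. The fragment below is only an illustration, assuming equal resistors and an unloaded bus, in which case the bus settles at the average of the terminal voltages tied to it. The names `bus_voltage` and `ONE`, and the 5-by-3 register size, are hypothetical and are not taken from any actual machine.

```python
PLUS, MINUS = 6.0, -6.0  # volts, the levels used in the text's example

# Hypothetical 5-by-3 register image of the numeral "1": 1 = black, 0 = white.
ONE = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]


def bus_voltage(register, template):
    """Voltage on the bus for `template` when `register` holds the scanned character.

    Where the template expects black, a resistor goes to the assertion terminal;
    where it expects white, to the negation terminal.
    Assumption: equal resistors and no load, so the bus sits at the average.
    """
    total, count = 0.0, 0
    for reg_row, tmpl_row in zip(register, template):
        for cell, expected in zip(reg_row, tmpl_row):
            assertion = PLUS if cell else MINUS
            negation = -assertion
            total += assertion if expected else negation
            count += 1
    return total / count


perfect = bus_voltage(ONE, ONE)          # every resistor sees +6 volts

smudged = [row[:] for row in ONE]
smudged[2][0] = 1                        # one white cell has turned black
imperfect = bus_voltage(smudged, ONE)    # 14 resistors at +6 volts, one at -6

print(perfect, imperfect)
```

In this toy case a single wrong cell out of fifteen pulls the bus from +6 down to about +5.2 volts, which is the sense in which the voltage measures the degree of match.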

At the same time that I am looking for a “1” in the register, I can
connect a set of resistors for the figure “2”; that is, I look to other
assertions and other negations. All of these resistors will be connected to
another bus, the “2” bus. I can do the same thing for the figures “3” and “4,”
for the letters “A,” “B,” “C,” “D,” in English, and for the letters of any
other alphabet or any symbol other than alphanumeric. For each such character
I will have a separate bus. When I feed any character into a particular
resistor set or matrix, I will get some voltage between +6 and -6 on the
corresponding bus. The closer it is to +6, the more nearly the unknown
character resembles the particular character represented by the resistor
matrix I examine.

These  resistor  matrices  we  call  “correlation  matrices”  because  they 
indicate  the  degree  of  correlation  between  the  unknown  character  and  each 
resistor  matrix.  In  one  such  machine  that  we  have  in  our  laboratory,  we 
have  provided  for  four  alphanumeric  fonts  to  be  read  simultaneously.  The 
only  limit  in  such  devices  is  how  much  current  is  available  at  each  of  the 
terminals  at  each  register  element. 

Now the way we can recognize the unknown character is to say simply
that the bus with the voltage nearest to +6 (the highest positive voltage)
will win, and that its character is the one in the register. We can add other
criteria. We can require that it must produce not only the highest voltage,
but also that there should be no other near it (say, nothing within a volt or
two). Furthermore, we can limit the recognition by requiring that the voltage
must be above some predetermined minimum (such as +3 volts) to prevent
the recognition of random smudges as characters. We can set limits on the
voltage that depend on the general blackness of the character. And so on
and on.
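
The recognition criteria listed above can be written down as a short decision rule. This is a minimal sketch, not the circuit of any actual machine: the function name `recognize` is hypothetical, the +3 volt minimum is the figure quoted in the text, and the 1 volt margin is one choice within the "volt or two" mentioned.

```python
def recognize(bus_volts, min_volts=3.0, min_margin=1.0):
    """Pick the character whose bus is highest, or return None to reject.

    bus_volts: dict mapping each candidate character to its bus voltage.
    A winner must reach min_volts and must beat the runner-up by min_margin.
    """
    ranked = sorted(bus_volts.items(), key=lambda item: item[1], reverse=True)
    best_char, best_v = ranked[0]
    if best_v < min_volts:
        return None                      # probably a smudge, not a character
    if len(ranked) > 1 and best_v - ranked[1][1] < min_margin:
        return None                      # two candidates are too close to call
    return best_char


print(recognize({"1": 5.2, "7": 2.0, "2": -1.0}))   # clear winner
print(recognize({"1": 5.2, "7": 4.8}))              # rejected: too close
print(recognize({"1": 2.5, "2": -3.0}))             # rejected: below minimum
```

Loosening or tightening the two thresholds trades rejected characters against misread ones, which is the choice discussed in the next paragraph.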

The  criteria  for  recognition  and  nonrecognition  of  a  character  depend 
on  what  we  intend.  For  example,  if  we  are  collecting  statistics  and  an  oc¬ 
casional  error  is  not  important  we  can  accept  any  best  match.  This  is  par¬ 
ticularly  true  when  the  only  errors  are  random  in  occurrence.  If  we 
are  doing  money  accounting  and  any  error  is  serious,  we  would  prefer 
to  have  “rejected”  characters  rather  than  misread  characters  or  “errors.” 
We  would  program  the  machine  for  all  the  stringent  criteria  mentioned 
above. 

If there is more than one font in the machine, one could take the
obvious approach that the best character of any font is recognized, except
that the choice then may be between several hundred different characters,
and the differences among many characters will be very small and
uncertain. A better way would be to determine which font the machine is
reading and then to eliminate the others from consideration. Thus if there
are Fonts 1, 2, 3 being read, and Font 1 comes up best 80 percent of the
time, Font 2 comes up best 15 percent of the time, and Font 3 comes up
best only 5 percent of the time, one can properly conclude that Font 1 is the
correct font. Then we can electrically disconnect the circuits of Fonts 2 and 3
and read the output as if only one font existed in the machine. This can be
done quickly and automatically, and thus reduces the number of choices to a
minimum.
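
The font-selection idea can be sketched as a simple vote over the characters read so far. The function name `dominant_font` and the `min_share` threshold are assumptions made for illustration; the text gives only the 80/15/5 percent example and does not say what share a real machine would require.

```python
from collections import Counter


def dominant_font(best_font_per_char, min_share=0.5):
    """Return the font that gave the best match most often, or None.

    best_font_per_char: for each character read, the font whose matrix won.
    min_share: hypothetical fraction of wins a font needs before the other
    fonts are switched out of the circuit.
    """
    counts = Counter(best_font_per_char)
    font, wins = counts.most_common(1)[0]
    share = wins / len(best_font_per_char)
    return font if share >= min_share else None


# The example from the text: Font 1 wins 80 percent of the time.
votes = ["Font 1"] * 80 + ["Font 2"] * 15 + ["Font 3"] * 5
print(dominant_font(votes))
```

When no font reaches the threshold the sketch returns `None`, which corresponds to keeping every font in the circuit.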

There  may  be  cases  where  the  fonts  are  truly  intermixed,  in  which  case 
one  has  to  use  brute  force.  There  is  other  information  about  characters  that 
one  can  utilize.  For  example,  in  reading  a  phone  book,  we  know  that  the 
name  of  the  entry  (the  name  of  the  person)  is  in  a  different  style  from  the 
address.  We  could  automatically  make  the  machine  switch  the  correct 
resistor  correlation  boards  into  the  circuits  when  the  name  is  read,  then 
switch  others  to  read  the  address,  still  others  to  read  the  name  of  the  ex¬ 
change,  and  still  others  to  read  the  numbers. 

In  general,  the  more  information  one  has  about  the  printed  material,  the 
more  sophisticated  the  machinery  can  be,  the  more  accurate,  and  I  must  add, 
the  more  expensive.  Actually  the  machine  we  have  in  our  laboratory,  which 
can  read  several  fonts  of  alphanumeric  characters,  has  cost  us  several  hun¬ 
dreds  of  thousands  of  dollars.  Such  an  expensive  toy  is  obviously  not  suited 
to  be  a  reading  machine  for  the  blind.  However,  a  great  many  things  that  I 
have  spoken  about  are  not  necessary  for  that  purpose.  A  machine  for  the 
blind  can  be  made  slower,  the  intelligence  of  the  blind  person  can  be  used  to 
help  the  machine,  and  many  secondary  functions  and  output  devices  could 
be eliminated or minimized, so that a machine for the blind could certainly
be made for a fraction of the cost of the monsters I’ve described,
particularly if one could build many of them.

As I have mentioned, scanning techniques used by different people vary.
For example, in one of the IBM commercial machines, instead of using a
single element to scan and define points on a paper, six short strokes are
used, clustered very closely together and later combined into a single
elemental area. In this way some preprocessing of the optical data is done
before it goes into the machine. A small spot of dirt can be detected and
eliminated, and the short scans can be analyzed to determine whether
the character is black or white at that point. The reason for using fine
scanning and combining the information before going on to the reading logic of
the machine is rather interesting. Suppose one considers a very large
character, but one made up of very thin lines, the kind one would draw on the
blackboard. If we employ a scanner whose elemental area is equal to the
thickness of this line, we would need perhaps a thousand horizontal scans,
and each scan would have to be divided into a thousand elemental areas.
This would mean that the total scanned area would have one million bits.
The logic that would have to deal with this quantity of information, the
shift registers, and so on, would become horribly expensive.
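
As a rough illustration of combining fine samples into coarser elemental areas before the logic, the sketch below pools each k-by-k block of a finely scanned field into one bit. It is not a description of the IBM circuit: the function name `combine_spots`, the square blocks, and the `min_black` threshold are all assumptions chosen to show the effect.

```python
def combine_spots(fine, k, min_black):
    """Pool each k-by-k block of fine samples into one coarse cell.

    A coarse cell is black (1) only if at least min_black of its fine samples
    are black, so an isolated speck of dirt is dropped.
    """
    n_rows, n_cols = len(fine), len(fine[0])
    coarse = []
    for r in range(0, n_rows - n_rows % k, k):
        row = []
        for c in range(0, n_cols - n_cols % k, k):
            black = sum(fine[r + i][c + j] for i in range(k) for j in range(k))
            row.append(1 if black >= min_black else 0)
        coarse.append(row)
    return coarse


# A thin vertical line in column 1 and a single speck of dirt at the top right.
FINE = [
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

coarse = combine_spots(FINE, k=2, min_black=2)
print(coarse)   # the line survives, the speck does not
```

Here 16 fine samples become 4 bits for the logic. On the same principle, the million-bit field of the blackboard example would shrink by the square of the pooling factor.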

The  trick  employed  in  this  type  of  system  is  to  use  the  fine  spot  for 
scanning  to  detect  the  thin  line,  but  to  throw  away  useless  information  before 
one  goes  into  the  logic  of  the  machine.  IBM  does  this  by  using  small  spots 
which  are  combined  into  big  spots  before  the  logic  section.  Another  way 
of  doing  the  same  thing  is  to  use  what  we  call  a  “sample  scan.”  With  this 
technique  one  covers  the  area  not  by  adjacent  lines  but  by  lines  spaced  quite 
far  apart.  For  example,  if  a  large  “2”  were  drawn  on  the  blackboard,  I  can 
draw  perhaps  half  a  dozen  vertical  lines  far  apart  that  cross  the  area  on 
which  the  “2”  is  located.  Then  I  could  draw  six  horizontal  lines,  also  far 
apart,  to  cover  the  same  area  but  at  right  angles  to  the  first  lines.  If  one  ex¬ 
amines  where  these  12  lines  cross  the  figure  “2,”  one  sees  that  these  crossing 
points  can  define  the  “2”  sufficiently  well  with  12  scans  instead  of  perhaps  a 
thousand.  In  other  words,  one  uses  a  sample  of  the  total  area.  This  means 
that  small  detail  between  the  lines  will  be  lost,  but  if  one  assumes  that 
characters  have  no  features  which  are  very  small  relative  to  their  total  size, 
this  chance  can  be  taken.  This  type  of  sample  scanning  is  good  for  thin  lines 
which  are  continuous,  where  it  is  safe  to  assume  that  a  line  crossing  the 
area  will  cross  the  character. 
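
The sample scan can be imitated on a stored image by looking only at a few widely spaced rows and columns and noting where each one crosses a black line. This is a sketch under stated assumptions: the 24-by-24 field, the even spacing of the lines, and the names `crossings` and `sample_scan` are illustrative only.

```python
def crossings(line):
    """Positions at which a scan line enters black (white-to-black transitions)."""
    return [i for i, cell in enumerate(line) if cell and (i == 0 or not line[i - 1])]


def sample_scan(grid, n_lines=6):
    """Scan only n_lines widely spaced rows and n_lines widely spaced columns."""
    n_rows, n_cols = len(grid), len(grid[0])
    row_picks = [(2 * k + 1) * n_rows // (2 * n_lines) for k in range(n_lines)]
    col_picks = [(2 * k + 1) * n_cols // (2 * n_lines) for k in range(n_lines)]
    return {
        "rows": {r: crossings(grid[r]) for r in row_picks},
        "cols": {c: crossings([row[c] for row in grid]) for c in col_picks},
    }


# Hypothetical 24-by-24 field with one thin vertical and one thin horizontal stroke.
GRID = [[1 if (c == 6 or r == 10) else 0 for c in range(24)] for r in range(24)]

result = sample_scan(GRID)
print(result["rows"])
print(result["cols"])
```

Twelve lines are examined instead of every row of the field. Any detail lying wholly between the sampled lines is missed, which is the risk the text accepts for thin, continuous strokes.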

This  freehand,  thin  drawing  of  a  “2”  brings  me  to  the  next  matter, 
namely  the  reading  of  handwritten  letters  and  numbers.  We  believe  that  it 
is  not  possible  at  this  stage  to  read  cursive  writing  of  alphabetic  material.  I 
believe  that  MIT  studies  recently  have  proven  this,  although  I  don’t  know 
how  rigorous  such  a  proof  can  be.  I  suggest  the  reader  try  the  following: 
write  English  words  carefully,  the  way  you  have  been  taught  to  do  in  school 
years  ago;  then  cover  some  of  the  letters  and  try  to  figure  out  what  the  other 
letters  are.  You  will  find  that  without  seeing  the  whole  word  there  simply 
isn’t  enough  information  to  describe  individual  letters.  Certainly  this  is  true 
with any ordinary cursive writing; one has to read by words. This can be
done by machines too, but the storage of complete word patterns, or features,
is very expensive and, as far as we know, nobody has tried it seriously. There
have been attempts at recognition of some general shapes of words, where
the vocabulary is limited to a few words such as the names of several major
cities of the United States.

Reading handwritten numbers, on the other hand, is possible and there
have been at least two machines built to do so. We have one that can read
ordinary handwritten numbers if reasonably well formed. By “reasonably,”
I mean that the “8” has to have two loops and the “6” one, but also that the
characters do not have to be of a particular size, a particular slant, or a
particular shape. This type of reading is best described as reading by curve
tracing, and there are several ways of doing this. For example, if I draw a
figure “4” and use a flying spot scanner I can make the spot circle in a small
arc. Skipping for a moment the problem of placing the spot on the character in
the first place, let us assume that the spot starts at some point on the left
side of the “4”; as the light circles and crosses and re-crosses the line of
the character, the photoelectric system and logic can tell where these
crossings occur. Then the spiraling or circling spot of light can be moved
along the line until the intersection of the horizontal bar of the “4” and the
vertical bar is reached. I can tell the machine that in the case of such an
ambiguous position it can always take the upper leg of the fork first; then
the spot of light will circle, going upward along the line until the end is
found. At the end, it reverses the direction of travel and goes down to the
bottom. Finding no more places to go, the machine says in effect, “I’ve
examined the character; I know which way I have moved along the lines. I can,
therefore, say that there was a short vertical, a horizontal that later joined
a long vertical, and that this is a ‘4.’ ” This type of machine can be built.
It has been played with and it is described in the U. S. Patent Office. It is
a very difficult machine to build, but is entirely possible.

Now  let  me  show  a  somewhat  better  method  of  scanning  using  a  circling 
spot  of  light — a  method  which  does  not  entail  the  difficulty  of  making  de¬ 
cisions  when  a  fork  in  the  road  is  reached.  Assume  that  the  character  is  thick 
(this  is  more  nearly  in  accordance  with  reality),  say,  again  a  “4,”  but  a  thick 
one.  Now  the  spot  of  light  is  arranged  not  to  move  across  the  line,  but  to 
touch  it  lightly:  the  circling  path  of  scanning  just  enters  into  the  black  area 
and  comes  out  again.  Again,  with  suitable  logic,  one  can  tell  when  the  circle 
is  on  black  and  when  it  is  on  white.  The  diameter  of  the  circles  in  this 
curve  tracing  technique  is  two  or  three  times  greater  than  the  thickness  of  the 
line;  and  the  spot  involved  is  perhaps  half  that  diameter.  If  the  circle  now 
is made to follow the line it does not have any difficult decisions to make. It
follows the outside of the line, going down on the left side of the character,
up on the right side of the character, and finally the path closes on itself. In
the case of Arabic numbers and the English capital alphabet there are no
internal curves that we need be concerned with. The character “A” has a
closed loop, but one doesn’t have to go inside of it to determine that the
character is a capital “A.” For capitals and numbers, then, the outside of
the character is sufficient.

In the case of the character “Θ,” the outside looks like an “O,” but one
has to go inside the circle to find that there is a cross bar. For such a
character, the system will not work and one would have to do something else.
The dotted “i” is treated as two separate characters; they are read separately
and they are combined later in the output logic of the machine or the computer
which a reading machine feeds. This type of difficulty is also true of some
Russian characters which consist of two separate elements; they are read as
separate characters and the utilization device has to read the two characters
together. It does not make any difference whether the word has five characters
or six, so long as you recognize the word.
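
The limitation of outline following can be demonstrated on a stored image. The sketch below does not model the circling spot of light; it substitutes a standard digital technique, Moore-neighbour boundary tracing, as a stand-in for "following the outside of the line." The grids `O` and `BARRED_O` and the function name `trace_outline` are hypothetical. The point illustrated is that an “O” and an “O” with a cross bar inside give the same outer path.

```python
# Eight neighbours in clockwise order, starting from west: (d_row, d_col).
NEIGHBOURS = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_outline(rows):
    """Return the outer boundary cells of the first shape met in raster order."""
    height, width = len(rows), len(rows[0])

    def black(r, c):
        return 0 <= r < height and 0 <= c < width and rows[r][c] == "#"

    start = next(((r, c) for r in range(height) for c in range(width)
                  if black(r, c)), None)
    if start is None:
        return []

    outline = [start]
    current, back = start, 0      # `back` indexes a neighbour known to be white
    for _ in range(8 * height * width):      # guard against endless loops
        for step in range(1, 8):
            i = (back + step) % 8
            nr, nc = current[0] + NEIGHBOURS[i][0], current[1] + NEIGHBOURS[i][1]
            if black(nr, nc):
                # The neighbour examined just before this one was white;
                # it becomes the new backtrack cell.
                pr = current[0] + NEIGHBOURS[i - 1][0]
                pc = current[1] + NEIGHBOURS[i - 1][1]
                current = (nr, nc)
                back = NEIGHBOURS.index((pr - nr, pc - nc))
                break
        else:
            break                            # isolated cell: nothing to follow
        if current == start and back == 0:
            break                            # the path has closed on itself
        outline.append(current)
    return outline


O = [".###.", "#...#", "#...#", "#...#", ".###."]
BARRED_O = [".###.", "#...#", "#####", "#...#", ".###."]

print(trace_outline(O) == trace_outline(BARRED_O))
```

Because the trace never looks inside the ring, the cross bar leaves no mark on the result, and some additional examination of the interior would be needed to tell the two characters apart.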

Let  me  tell  you  of  a  curve  tracing  machine  where  the  curve  tracer  does 
not  follow  the  lines  of  the  characters  in  a  real  sense,  but  only  in  a  philosophi¬ 
cal  sense.  This  machine  was  invented  by  Arthur  Holt,  who  is  Vice  President 
of  Electronics  in  our  Company;  the  machine  works.  This  machine  scans  with 
an  ordinary  vertical  row  of  photocells;  we  could  have  used  a  mechanical 
scanner  that  scans  vertical  lines  moving  across  the  character:  the  system 
of  scanning  is  immaterial.  Let  me  assume  a  mechanical  scanner  because  it 
makes  the  operation  easier  to  explain.  Take  a  capital  “A”  as  the  character 
to  be  scanned.  Vertical  rows  are  scanned  moving  from  left  to  right.  As 
soon  as  the  scan  “hits”  or  touches  the  first  point  (the  bottom  of  the  left  leg  of 
the  “A”)  we  assign  a  circuit  called  “Watchbird  No.  1”  to  this  black  line.  The 
second  scan  will  touch  the  same  leg  slightly  higher,  but  the  spots  will  be 
adjacent  to  each  other.  The  third  scan  will  touch  the  same  leg  of  the  “A”  still 
higher,  and  the  Watchbird  No.  1  will  follow  this  leg  on  successive  scans  as 
it  moves  from  left  to  right  on  the  character.  Finally,  it  will  come  to  a  point 
where  the  horizontal  bar  of  the  “A”  begins — where  a  successive  scan,  instead 
of  hitting  a  single  line,  experiences  two  independent  blacks  (two  separate 
black  points)  one  above  the  other.  The  machine  then  makes  note  of  the  fact 
that  the  line  has  split  into  two  and  records  a  “split.”  At  the  same  time,  we  as¬ 
sign  a  second  circuit  to  follow  the  line  above  the  horizontal  bar.  Watchbird 
No.  1  will  continue  tracing  the  horizontal  bar,  and  a  new  circuit  called 
“Watchbird No. 2” will follow the line above it. As it keeps moving to the
right, it will continue to record the fact that these lines are continuous to the
right.  After  some  additional  scans  (perhaps  another  dozen  or  so)  the  ma¬ 
chine  will  suddenly  discover  that  the  two  points  it  is  following  coincide  and 
become  one.  We  call  this  a  “joint.”  The  machine  again  records  the  interest¬ 
ing  fact  that  a  joint  has  occurred,  at  which  point  it  no  longer  needs  two 
Watchbirds.  Watchbird  No.  2  is  discarded,  and  Watchbird  No.  1  continues 
to  follow  the  line  down  on  the  right  leg  until  there  is  nothing  else  to  follow, 
at  which  time  the  character  is  finished. 

The  events  that  we  have  recorded  are:  a  line  starting  on  the  left  side, 
splitting  into  two,  joining  into  one  again,  becoming  one  for  a  while,  and 
then  disappearing.  This  is  a  capital  “A.” 

For  reading  the  letter  “O”  or  the  numeral  “zero”  the  logic  will  be  that  a 
single  black  spot  will  immediately  split  into  two  lines;  the  two  lines  will 
continue  one  above  the  other  until  the  two  lines  join;  immediately  after  there 
is  nothing  else.  This  is  always  an  “O”  or  a  “zero.”  You  will  note  that  this 
type  of  logic,  in  the  simplest  form  can  take  care  of  simple  letters  and  num¬ 
bers  independently  of  their  size,  their  orientation,  and  their  exact  shape.  We 
can  program  circuits  into  this  type  of  machine  to  determine  whether  the  line 
is  rising  or  falling,  how  long  it  continues,  how  far  lines  are  apart  from  each 
other,  and  so  on.  All  this  makes  for  complexity  and  cost.  For  simple,  care¬ 
fully  written  numerals,  i.e.,  in  which  the  “4”  is  made  with  an  open  top  to 
avoid  confusion  with  a  “9,”  very  simple  curve  tracing  logic  of  this  type  is  all 
that  is  required. 
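
The sequence of events recorded for the “A” and the “O” can be reproduced with a much simplified sketch. It only counts the separate black runs met in each vertical scan and reports when that count changes; it does not check, as the machine described above does, that spots on successive scans are adjacent. The grids and the names `count_runs` and `trace_events` are hypothetical.

```python
def count_runs(column):
    """Number of separate black runs met in one top-to-bottom scan."""
    return sum(1 for i, cell in enumerate(column)
               if cell == "#" and (i == 0 or column[i - 1] != "#"))


def trace_events(rows):
    """List (event, scan number) pairs as vertical scans move left to right."""
    events = []
    tracked = 0                  # lines currently followed ("Watchbirds" in use)
    n_cols = len(rows[0])
    for c in range(n_cols + 1):  # one extra, blank scan marks the end
        runs = count_runs([row[c] for row in rows]) if c < n_cols else 0
        if tracked == 0 and runs > 0:
            events.append(("start", c))
            tracked = 1
        if runs > tracked:
            events.append(("split", c))
        elif 0 < runs < tracked:
            events.append(("join", c))
        elif runs == 0 and tracked > 0:
            events.append(("end", c))
        tracked = runs
    return events


LETTER_A = [
    "...###...",
    "..##.##..",
    "..#...#..",
    ".#.....#.",
    ".#######.",
    "#.......#",
    "#.......#",
]
LETTER_O = [".###.", "#...#", "#...#", "#...#", ".###."]

print(trace_events(LETTER_A))
print(trace_events(LETTER_O))
```

Both characters give start, split, join, end. What separates them in this toy case is timing: the “O” splits on the scan right after it starts and ends right after it joins, while the “A” runs as a single line for a stretch on each side. A practical rule would need a tolerance tied to the stroke thickness.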

There  are  some  basic  difficulties  with  curve  tracing  machines  in  general. 
The  main  one  is  that  if  the  character  is  broken  (if  it  is  made  with  a  type¬ 
writer  ribbon  which  is  light  so  that  the  character  really  consists  of  individual 
spots  or  little  dashes)  the  curve  tracer  gets  lost.  One  can  design  sophisticated 
filters  and  flywheel  circuits  into  the  curve  tracer  so  that  it  can  “jump”  small 
gaps,  but  then  it  has  trouble  with  smudgy  characters  that  cause  jumping  in 
the  wrong  places.  One  can  first  clean  up  the  character  by  various  tricks  of 
prefiltering,  but  by  the  time  one  finishes  one  has  quite  a  bit  of  the  correlation 
machine  built  and  one  might  just  as  well  build  correlation  machines  which 
can  handle  spotty,  dirty  characters  which  curve  tracers  have  great  difficulty 
in  following.  Curve  tracers,  then,  are  very  good  for  thin,  clean  lines  where 
correlation  matrices  have  to  be  very  large  or  very  sophisticated. 

Now I would like to discuss another type of machine, often called the
“feature analysis” machine. One example is a machine that is used by the
Farrington Corporation, which has built many stroke analysis machines. If
one deals with characters which are of relatively simple shapes, such as those
made up, for example, of vertical and horizontal strokes only, a relatively
inexpensive and straightforward machine can be built. Suppose one scans top
to  bottom  with  a  mechanical  Nipkow  disk  type  of  scanner.  One  can  then  read 
a  “4”  as  a  short  line  in  the  upper  part  of  the  character  followed  by  a  hori¬ 
zontal  line  in  the  middle  of  the  character,  which  then  is  followed  by  a  long 
vertical  line  on  the  right  side  of  the  character.  This  type  of  sequence  of  events 
can  be  easily  determined;  one  can  scan  each  line  more  than  once  to  make 
sure  that  there  is  no  ambiguity.  The  machine  knows  where  the  vertical  line 
sections  are  because  the  scan  is  timed  and  divided  into  small  elements  or 
lengths.  One  also  knows  the  slope  of  the  lines.  This  type  of  logic  can  read 
not  only  simple  “match  stick”  numbers,  but  complicated  ordinary  char¬ 
acters  as  in  books.  In  the  latter  case,  the  logic — the  set  of  rules  put  into  the 
machine — is  no  longer  simple  and  the  machine  gets  progressively  more  com¬ 
plicated  and  more  expensive.  Farrington  has  built  machines  that  read  ordi¬ 
nary  characters  very  well. 
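
The stroke sequence described for the “4” can be sketched as follows. This is an illustration only and not the logic of the Farrington machines: the length thresholds, the stroke names, the small `RULES` table, and the functions `column_stroke` and `read_character` are all assumptions, and only the longest black run in each scan is considered.

```python
def column_stroke(column):
    """Classify the longest black run in one top-to-bottom scan (None if blank)."""
    height = len(column)
    best_start, best_len, r = 0, 0, 0
    while r < height:
        if column[r] == "#":
            start = r
            while r < height and column[r] == "#":
                r += 1
            if r - start > best_len:
                best_start, best_len = start, r - start
        else:
            r += 1
    if best_len == 0:
        return None
    centre = best_start + (best_len - 1) / 2
    if best_len >= 0.8 * height:
        return "long_vertical"
    if best_len >= 0.4 * height:
        return "short_upper" if centre < height / 2 else "short_lower"
    if centre < height / 3:
        return "horizontal_top"
    if centre < 2 * height / 3:
        return "horizontal_mid"
    return "horizontal_bottom"


# Hypothetical rule table: sequence of strokes met from left to right.
RULES = {
    ("short_upper", "horizontal_mid", "long_vertical"): "4",
    ("long_vertical",): "1",
}


def read_character(rows):
    """Collapse repeated strokes into a sequence and look it up in RULES."""
    sequence = []
    for c in range(len(rows[0])):
        stroke = column_stroke([row[c] for row in rows])
        if stroke is not None and (not sequence or sequence[-1] != stroke):
            sequence.append(stroke)
    return RULES.get(tuple(sequence))


FOUR = ["#..#.", "#..#.", "#..#.", "####.", "...#.", "...#.", "...#."]
ONE = [".#."] * 7

print(read_character(FOUR), read_character(ONE))
```

For simple “match stick” numerals a table of a few such sequences suffices. For ordinary book characters the table and the stroke classes would grow considerably, which is the added complexity and expense noted in the text.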

Some  machines  do  not  use  scanning  in  the  ordinary  sense  at  all.  These 
machines  are  map-matching  machines,  which  are  certainly  the  oldest  type  of 
reading  machines  mentioned  in  the  art.  I  built  one  of  these  at  the  National 
Bureau  of  Standards  many  years  ago;  Baird-Atomic  is  playing  with  one  now. 
Many  people  have  tried  this  type  of  equipment  because  it  appears  so  simple 
and  so  attractive. 

In  this  approach  the  character  on  the  paper  is  projected  onto  a  trans¬ 
parency,  or  more  properly  onto  a  mask.  Let  us  assume  that  a  character  is 
again  the  capital  letter  “A.”  It  is  projected  by  a  lens  onto  a  mask  which 
has  a  transparent  section  shaped  exactly  like  the  image.  The  mask,  then,  is 
all  opaque  where  the  paper  is  white,  and  transparent  where  the  character 
is  black.  If  one  places  a  photocell  behind  the  mask,  one  would  expect  no 
light  to  go  through  to  the  photocell  when  a  perfectly  black  “A”  is  projected 
onto this mask. Unfortunately there are no perfectly black characters: they
all reflect some light; there is diffusion in the lens; and so on. What the
photocell  senses  is  less  light,  but  not  extinction.  This  is  where  the  difficulty 
lies.  Another  character  that  looks  somewhat  like  an  “A”  may  also  give  some¬ 
thing  close  to  the  same  degree  of  extinction  as  an  “A.”  One  could  “nor¬ 
malize”  the  photocell  response  to  the  particular  areas  involved,  but  such 
differences  become  very  small  particularly  with  alphanumeric  characters,  and 
this  type  of  machine  doesn’t  work  well.  No  practical  machines  of  this  type 
have  ever  been  built  to  the  best  of  my  knowledge.  Moreover,  there  is  always 
the difficulty of positioning the character so that it overlies the mask exactly,
and the normalizing of characters for the areas is difficult and expensive;
since characters vary so much in their total blackness or in their
continuity  (that  is,  they  are  not  necessarily  all  even  in  density  and  may  even 
have  gaps),  this  type  of  machine  is  more  of  a  laboratory  curiosity  than  a 
practical  device. 

Because  we  have  learned  in  recent  times  so  much  about  reading  ma¬ 
chines  using  resistor  correlation  boards,  we  have  been  able  to  attack  this 
problem  again  and  build  some  mask  machines  which  do  work,  at  least  for 
numerals. First, let me describe a mask machine which I built at the National
Bureau of Standards which eliminates this difficulty of small differences between
characters. In this machine, a character is projected onto a mask, but
instead  of  feeding  all  of  the  light  going  through  the  mask  onto  a  photocell 
the  mask  is  scanned,  one  point  at  a  time.  This  was  done  by  a  simple  Nipkow 
disk  spinning  behind  the  mask.  If  the  character  is  gray  and  fits  the  mask  as 
it  is  scanned  (the  direction  of  scan  is  immaterial)  small  amounts  of  light 
go  into  the  photocell.  These  small  amounts  are,  say,  a  quarter  of  what 
would  obtain  if  plain  white  paper  were  projected  onto  the  mask.  If  now  a 
character  is  projected  which  is  almost  of  the  same  shape,  but  differs  in  one 
or  two  points,  out  of  the  photocell  will  come  one  or  two  very  large  “spikes” 
of  current  which  correspond  to  white  areas.  If  one  uses  a  peak  detector  the 
differences  are  very  great  between  the  character  that  matches  and  the  one 
that  doesn’t:  while  “O”  and  “Q”  may  only  differ  by  two  to  three  percent  in 
total  area,  the  mismatch  points  can  make  output  vary  in  a  ratio  of  10  to  1. 
This  particular  machine  had  no  difficulty  separating  characters  which  are 
otherwise  very  similar. 
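The contrast between integrating all the light behind a mask and peak-detecting a point-by-point scan can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the 7 by 7 glyphs, the reflectance of 0.25 for gray ink (the "quarter" mentioned above), and the function names. It is not a description of the machine actually built.

```python
# Illustrative model of the scanned mask machine; all numbers are assumptions.
INK, PAPER = 0.25, 1.0   # relative light from gray ink and from white paper


def parse(rows):
    """Turn rows of '#' (black) and '.' (white) into a grid of booleans."""
    return [[ch == "#" for ch in row] for row in rows]


LETTER_O = parse([
    ".#####.",
    "#.....#",
    "#.....#",
    "#.....#",
    "#.....#",
    "#.....#",
    ".#####.",
])

LETTER_Q = parse([
    ".#####.",
    "#.....#",
    "#.....#",
    "#.....#",
    "#...#.#",   # the tail of the "Q" adds two black points
    "#....##",
    ".#####.",
])


def samples(mask, character):
    """Light reaching the photocell at each transparent point of the mask.

    The mask is transparent exactly where its own character is black.
    """
    return [INK if character[r][c] else PAPER
            for r, row in enumerate(mask)
            for c, transparent in enumerate(row) if transparent]


def integrated(mask, character):
    """Unscanned machine: one photocell sums all light behind the mask."""
    return sum(samples(mask, character))


def peak(mask, character):
    """Scanned machine with a peak detector: the brightest point decides."""
    return max(samples(mask, character))


match_total = integrated(LETTER_Q, LETTER_Q)      # "Q" on the "Q" mask
mismatch_total = integrated(LETTER_Q, LETTER_O)   # "O" on the "Q" mask
match_peak = peak(LETTER_Q, LETTER_Q)
mismatch_peak = peak(LETTER_Q, LETTER_O)

print(f"integrated light: {match_total:.2f} vs {mismatch_total:.2f}")
print(f"peak light:       {match_peak:.2f} vs {mismatch_peak:.2f}")
```

On this coarse grid the integrated light differs by roughly a quarter, while the peak differs by a factor of four; with blacker ink or a finer grid the integrated difference shrinks and the peak ratio grows. Note that in this model a "Q" projected onto the "O" mask produces no spike at all, since the "Q" covers every point of the "O".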

The interesting thing about this machine is that it is almost exactly identical
to a machine that scans the character first and sends the output of the
scanner into a correlation matrix in which the differences are greatly exaggerated.
In the actual case, of course, we first projected the image onto the
correlation matrix (the mask) and scanned the output afterward.

This  type  of  machine  can  be  built  for  the  blind  and  one  could  probably 
make  it  cheap  enough  and  slow  enough  to  be  useful.  The  scanning  must 
occur  for  each  superposition  of  the  character;  and  if  one  were  to  do  this 
superposition  of  masks  serially,  one  could  build  a  machine  that  uses  only  one 
photocell,  a  simple  scanner,  a  simple  mask  disk  which  contains  all  the  trans¬ 
parencies,  and  if  the  blind  user  does  the  physical  work  of  positioning  the 
paper  and  some  other  tasks  of  this  type,  a  machine  could  be  built  for  a  few 
thousand  dollars. 

I  would  now  like  to  go  back  for  a  moment  to  the  resistor  correlation 
machine which I described first. I neglected to mention (and it is just as
well to introduce it at this point) that the difference between “O” and “Q”
can  be  greatly  exaggerated  in  the  resistor  machine  by  giving  weight  to  the 
points  which  contain  the  tail  of  the  “Q.”  If  I  draw  a  large  “Q”  and  a  large 
“O,”  your  eye  immediately  focuses  on  the  little  tail.  This  may  be  a  small  part 
of  the  character,  but  it  is  the  feature  that  specifies  the  difference  between 
them.  Your  eye  and  your  mind  do  this  extremely  easily  and  you  are  not  even 
conscious  of  the  process.  In  a  reading  machine,  which  is  not  by  any  means 
as  clever  as  you  are,  we  have  to  give  this  tail  extra  weight  to  exaggerate  the 
differences,  and  we  can  do  that  by  using  resistor  values  which  are  different 
from  the  resistors  connected  to  other  points,  so  that  the  tail,  which  may  be 
only  a  few  percent  of  the  total  area,  may  produce  a  10  or  20  percent  differ¬ 
ence  in  the  output.  You  might  ask  why  we  do  not  assign  the  tail  a  much 
greater  difference.  The  reason  is  that  we  can’t  put  that  many  eggs  into  one 
basket. The “Q” has to be separated from other characters; the “O” has to be
separated  from  “C”  and  “G” — and  so  on;  and  so  we  give  extra  weight,  but 
not  the  entire  weight,  to  the  difference  between  “O”  and  “Q.” 
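The effect of weighting the tail can be sketched numerically. The model below is an assumption made for illustration: a crude "O" drawn as the border of a 9 by 9 grid, a one-point tail for the "Q", and a weight of 4 standing in for a different resistor value. The names `ring`, `TAIL`, and `response` are hypothetical.

```python
SIZE = 9  # assumed grid size for the illustration


def ring():
    """Black points of a crude "O": the border of the grid minus its corners."""
    cells = set()
    for i in range(1, SIZE - 1):
        cells |= {(0, i), (SIZE - 1, i), (i, 0), (i, SIZE - 1)}
    return cells


TAIL = (6, 6)                # assumed position of the tail of the "Q"
LETTER_O = ring()            # 28 points
LETTER_Q = ring() | {TAIL}   # 29 points


def response(template, weights, character):
    """Normalized output of one weighted summing network.

    Each point of the template contributes its weight (default 1.0) when the
    unknown character is black at that point.
    """
    total = sum(weights.get(cell, 1.0) for cell in template)
    matched = sum(weights.get(cell, 1.0) for cell in template if cell in character)
    return matched / total


uniform = {}
tail_heavy = {TAIL: 4.0}     # the tail counts four times as much as other points

drop_uniform = 1.0 - response(LETTER_Q, uniform, LETTER_O)
drop_weighted = 1.0 - response(LETTER_Q, tail_heavy, LETTER_O)

print(f"'O' seen by the 'Q' network, equal weights: {drop_uniform:.1%} below a true 'Q'")
print(f"'O' seen by the 'Q' network, heavy tail:    {drop_weighted:.1%} below a true 'Q'")
```

With equal weights the missing tail lowers the output by about 3 percent; with the heavier weight the drop is 12.5 percent. Pushing the weight much higher would make the output depend almost entirely on one point, which is the "eggs in one basket" problem described above.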

We  can  do  the  same  thing  in  mask  matching  machines  by  giving  weight 
to  different  parts  of  characters.  This  is  done  by  superimposing  a  gray  scale 
on  the  mask  so  that  certain  parts  of  the  mask  have  more  importance  than 
others. In the “O”/“Q” case, we can cover most of the circle of the characters
by a gray mask (which transmits perhaps half as much light as the
clear  transparency)  so  that  the  part  containing  the  tail  gets  an  added  weight. 
This  can  be  done  for  other  characters  and  the  ratios  can  be  anything  one 
likes. 

As we worked on these machines it appeared that a mask machine does
not need to be scanned to become a practical device, that one could develop
“assertion” and “negation” masks for each character. One is a transparency
which  says  where  the  character  should  be;  the  other  transparency  tells  the 
machine  where  the  character  should  not  be.  The  output  of  the  negation 
mask  is  inverted  and  added  to  the  assertion  mask,  and  the  two  voltages 
together  indicate  the  correlation  between  the  unknown  character  and  the 
pair  of  masks.  By  having  such  a  pair  of  masks  for  each  character,  one  can 
then  determine  the  best  correlation  between  any  set  of  masks  and  the  un¬ 
known  character,  and  the  “best  set”  can  be  chosen,  with  whatever  other 
criteria  you  wish  to  put  into  the  system.  We  have  actually  built  such  ma¬ 
chines  using  two  photocells  and  sets  of  masks.  One  of  the  interesting  things 
about  such  mask  machines  is  they  have  no  regard  for  the  intricacies  of  the 
font. Since the mask is a photographic image of the character (somewhat retouched),
no special font designs need to be practiced. It is obvious, of
course, that simple characters which differ from each other very greatly
are  still  the  best,  but  ordinary  characters  will  do.  There  is  no  element-by¬ 
element  scanning  in  such  machines;  the  resolution  can  be  considered  to  be 
infinite.  The  problem  of  positioning  is  quite  serious,  however,  because  the 
character  image  has  to  fall  accurately  upon  each  pair  of  masks.  Not  knowing 
what  the  character  is,  the  image  must  fall  on  all  of  the  sets  of  masks.  This 
can  be  done  in  parallel  (that  is,  the  image  can  be  split  and  projected  simul¬ 
taneously  onto  the  complete  set  of  masks)  or  it  can  be  done  serially  (the 
image  projected  on  one  set,  then  on  the  next,  then  the  next,  and  so  on).  If 
done  serially,  only  two  phototubes  are  needed  (as  a  matter  of  fact,  if  the 
machine  is  pushed  hard  enough  it  can  be  done  with  one,  but  this  is  kind 
of  tricky).  A  straight-forward  approach  requires  two  photocells,  and  the  ma¬ 
chine  has  to  remember  what  each  correlation  voltage  is,  and  which  is  the 
best  after  going  through  all  the  masks.  If  one  has  money  but  is  short  of 
time,  the  parallel  approach  can  be  used:  split  the  image  and  use  a  pair  of 
photomultipliers for each set of masks. Many, including us, have tried
to have the cake and to eat it faster. The registration in one direction can
always  be  assured  by  moving  the  paper;  and  vertical  registration  can  be 
accomplished  either  by  oscillating  a  mirror,  revolving  a  prism,  or  moving 
the  masks.  In  any  case  a  great  deal  of  time  is  lost  in  trying  to  superimpose 
the characters on the mask. If we were designing such a reader for any
practical, commercial purpose, the superposition would have to occur very quickly,
say  in  a  few  microseconds.  For  use  by  the  blind,  where  ten  characters  per 
second  may  be  enough,  the  positioning  problem  is  much  less  severe. 
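The assertion and negation scheme, taken serially, can be sketched as follows. This is a simplified model under stated assumptions, not the circuit of the machines described: characters are sets of black points on a 9 by 9 grid, the negation mask is taken to be everything outside the character, each photocell output is normalized by its mask area, and the names `GLYPHS`, `MASK_PAIRS`, `correlation`, and `recognize` are hypothetical.

```python
SIZE = 9
GRID = {(r, c) for r in range(SIZE) for c in range(SIZE)}


def ring():
    """Black points of a crude "O": the border of the grid minus its corners."""
    cells = set()
    for i in range(1, SIZE - 1):
        cells |= {(0, i), (SIZE - 1, i), (i, 0), (i, SIZE - 1)}
    return cells


# Assumed test characters: "Q" adds a tail point, "C" opens the right side.
GLYPHS = {
    "O": ring(),
    "Q": ring() | {(6, 6)},
    "C": ring() - {(3, SIZE - 1), (4, SIZE - 1), (5, SIZE - 1)},
}

# Assertion mask: where the character should be.
# Negation mask: where the character should not be.
MASK_PAIRS = {label: (ink, GRID - ink) for label, ink in GLYPHS.items()}


def correlation(character, assertion, negation):
    """Assertion output plus the inverted negation output."""
    wanted = len(character & assertion) / len(assertion)
    unwanted = len(character & negation) / len(negation)
    return wanted - unwanted


def recognize(character):
    """Serial operation: try each mask pair in turn and remember the best."""
    best_label, best_value = None, float("-inf")
    for label, (assertion, negation) in MASK_PAIRS.items():
        value = correlation(character, assertion, negation)
        if value > best_value:
            best_label, best_value = label, value
    return best_label, best_value


for name, glyph in GLYPHS.items():
    print(name, "->", recognize(glyph))
```

In this model the assertion mask alone cannot separate a "Q" from an "O", because a "Q" covers every point of the "O" mask; the negation mask supplies the missing evidence. The loop in `recognize` corresponds to the requirement that the machine remember each correlation voltage and pick the best after going through all the masks.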

The  problem  of  paper  moving  is  not  a  trivial  one.  If  one  moves  separate 
documents,  they  must  be  moved  extremely  fast:  10,  15,  or  20  documents 
per  second.  In  one  case,  we’ve  been  asked  to  move  80  documents  per  second; 
this  is  really  pushing  paper  around.  In  case  of  books  much  more  time  per 
page  is  required;  even  at  high  speed  each  page  takes  several  seconds  to 
read.  There  are  many  ways  of  locating  the  character  on  a  page.  We  can 
move  the  paper  by  pure  simple  translation;  we  can  wrap  it  around  a  drum 
and  make  the  lines  into  a  continuous  helix;  we  can  oscillate  a  mirror  over 
the  page;  we  can  use  revolving  prisms,  multisided  mirrors,  and  so  on.  The 
particular  solution  depends  on  the  problem  and  the  type  of  paper.  The  best 
system  of  all,  of  course,  is  to  have  everything  typed  on  continuous  ribbon 
like  Western  Union  telegrams  and  run  the  ribbon  past  an  optical  system. 
Unfortunately,  things  don’t  normally  come  this  way,  and  the  solution  is 
not  quite  that  easy.  We  actually  have  built  machines  that  can  read  240 
lines per second. This means, as somebody pointed out to us, that one
could read a novel in perhaps one minute. I don’t know why one should
want  to  read  a  novel  in  a  minute,  but  if  one  did  this  could  be  arranged. 
The  book  would  have  to  be  cut  up  first  and  rearranged  in  a  more  con¬ 
venient  form.  In  such  high  speed  machines  we  have  used  a  retina  approach 
with  the  retina  consisting  of  photomultipliers,  so  that  the  actual  reading 
time is negligibly small. To assure vertical registration we make the
retina  always  taller  than  the  character,  and  use  many  resistor  matrices  in 
duplicate.  The  same  character  may  be  looked  for  in  many  places  in  the 
retina  so  that  no  time  is  wasted  shifting  the  character  around  electronically. 
In  the  shift  register  machines,  of  course,  the  character  is  shifted  around. 
This  wastes  time,  but  it  means  that  one  correlation  matrix  can  be  used  for 
each  character. 
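The idea of looking for the same character at many vertical positions of a tall retina can be sketched as below. The sketch is an illustration under assumptions: a 3 by 3 template for "T", a retina six rows tall, and agreement counted point by point. In a retina machine each offset would have its own duplicate matrix working at the same time; in a shift register machine the offsets would be tried one after another with a single matrix. The loop stands for either.

```python
TEMPLATE_T = [
    "###",
    ".#.",
    ".#.",
]

# Assumed retina, taller than the character; the "T" sits two rows down.
RETINA = [
    "...",
    "...",
    "###",
    ".#.",
    ".#.",
    "...",
]


def match(window, template):
    """Fraction of points on which a retina window agrees with the template."""
    agreements = [a == b
                  for window_row, template_row in zip(window, template)
                  for a, b in zip(window_row, template_row)]
    return sum(agreements) / len(agreements)


def best_vertical_offset(retina, template):
    """Evaluate the template at every vertical position and keep the best."""
    height = len(template)
    scores = [match(retina[top:top + height], template)
              for top in range(len(retina) - height + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


print(best_vertical_offset(RETINA, TEMPLATE_T))
```

The duplicated matrices buy speed at the price of hardware, which is consistent with the trade described above: shifting wastes time but needs only one correlation matrix per character.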

Lest  the  reader  become  confused,  I  should  say  that  a  10,000  character 
per  second  machine  uses  a  retina.  A  normal  nonretina  machine,  using  a 
single  row  of  photocells,  will  do  perhaps  up  to  2000  characters  per  second, 
and  the  mechanical  mask  machines  (in  which  the  character  must  be  posi¬ 
tioned  on  the  mask)  can  conveniently  process  100  to  200  characters  per 
second.  The  interesting  thing  about  the  speed  of  such  machines  is  that  even 
if  characters  go  through  a  register  continuously  they  are  read  in  about  the 
same  time  as  when  the  image  is  wobbled  over  a  mask.  In  other  words  the 
actual  reading  may  have  to  be  done  in,  say,  10  microseconds.  It’s  a  ques¬ 
tion  of  how  one  likes  to  move  characters,  either  the  image  of  a  character 
over  a  mask  or  the  image  stored  electronically  in  the  register.  Shifting  is 
not  done  when  pushing  for  higher  speed;  the  character  is  read  whenever  it 
hits  somewhere  near  the  middle  of  the  retina  and  the  vertical  registration  is 
taken  care  of  by  redundant  correlation  matrices.  This  is  one  of  the  reasons 
for  the  high  cost  of  these  fast  machines.  Since  this  is  industrial  money 
we’re  spending,  money  which  is  hard  to  come  by,  we  don’t  make  too  many 
of  these  machines. 

I  would  predict  that  the  problem  of  reading  for  the  blind  will  be  solved 
by  reading  machines  of  the  general  type  I’ve  been  discussing,  and  not  by 
the  type  that  converts  from  a  visual  image  to  a  tactile  image,  or  some  sort 
of  “sound  image.”  I  believe  that  even  if  one  uses  some  sound  code  the 
user  cannot  absorb  the  code  fast  enough.  I  think  that  because  people  are 
lazy  (and  I  imagine  that  this  is  true  also  for  blind  people),  they  will  want 
to  have  the  visual  image  voiced  at  a  rate  of  perhaps  ten  characters  per 
second.  I  think  machines  can  now  be  built  for  relatively  little  money — of 
the  order  of  five  to  ten  thousand  dollars — machines  designed  especially  for 
the blind that will have an output of spoken words. I suspect, moreover, the
way in which this will be done is that the common words will be stored as
words,  and  the  less  common  words  will  be  spelled  out.  This  seems  to  be  a 
good  compromise  for  the  next  ten  years  or  so.  I  do  not  know  exactly  how 
many  words  we  shall  need  for  ordinary  English.  I  suspect  that,  with  the 
exception  of  our  British  friends,  perhaps  two  to  five  thousand  words  may 
be  good  enough  and  the  rest  can  be  spelled  out  because  they  don’t  really 
come  up  often. 
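The proposed compromise, stored words for the common vocabulary and spelling for everything else, amounts to a simple lookup with a fallback. The sketch below is only an illustration: the eight-word vocabulary is a hypothetical stand-in for the two to five thousand words suggested above, and `speech_units` is an assumed name.

```python
# Hypothetical stand-in for a stored vocabulary of a few thousand words.
STORED_WORDS = {"the", "of", "and", "to", "a", "in", "is", "machine"}


def speech_units(text):
    """Return the sequence of recordings to play for a line of recognized text.

    Each unit is ("word", w) for a stored word or ("letter", ch) for one
    letter of a word that has to be spelled out.
    """
    units = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if not word:
            continue
        if word in STORED_WORDS:
            units.append(("word", word))
        else:
            units.extend(("letter", ch) for ch in word)
    return units


print(speech_units("The machine reads."))
```

Here "the" and "machine" are played as whole words, while "reads" is spelled letter by letter.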

One  could  build  an  optical  match  machine  today,  I  believe,  for  very 
little  money.  This  is  a  machine  that  can  read  at  least  one  font.  This  is  a 
case  where  a  blind  user  can  be  of  great  help  by  doing  several  tasks.  He  can 
change,  for  example,  the  height  and  width  of  the  characters  to  be  read.  In 
reading  a  book  one  doesn’t  have  to  read  the  capitals  at  all;  at  least  I  think 
so.  I  believe  that  if  one  skips  the  first  letter  of  each  sentence  (the  capital 
letter)  one’s  brain  could  fill  in  the  missing  character.  One  could  save  a 
large  cost  of  the  machine  by  thus  eliminating  capitals.  Also,  I  think  one 
could  read  newspaper  print  perfectly  well  without  the  headlines.  I  think, 
in  fact,  that  you  may  be  better  off  if  you  don’t  read  the  headlines.  I  be¬ 
lieve  that  if  you  read  the  small  text  of  the  New  York  Times  you  would 
get  all  the  meaning  well  enough  because  the  subject  headline  is  repeated, 
as  you  know,  and  the  first  paragraph  states  the  whole  story. 

If  you’re  willing  to  compromise  and  read  one  or  two  fonts,  adjust  the 
size  by  hand,  move  the  platen  by  hand,  and  so  on,  I  think  such  a  machine 
could  be  built  for  between  five  and  ten  thousand  dollars.  This  machine 
would  read  as  fast  as  one  could  absorb  the  information.  It  would  not 
have  too  sophisticated  an  output.  It  could  spell  out  the  speech,  which  is 
relatively  straightforward,  and  this  would  be  all. 

Today  we  are  building  commercial  machines  costing  from  50  thousand 
dollars  on  up.  These  contain  a  great  deal  of  housekeeping  machinery.  If 
one  eliminates  the  paper  moving  problem,  and  lets  the  blind  user  put  the 
book under the plate, adjust its position, turn all the knobs, and change the
font  (by  changing  a  mask,  for  example)  one  could  conceivably  make  a 
machine  for  ten  thousand  dollars.  I’m  talking  here  about  production  quan¬ 
tities  like  a  hundred  units — certainly  not  one  unit.  One  can’t  build  any¬ 
thing  for  ten  thousand  dollars,  not  even  a  phonograph  pick-up. 

I  would  like  to  say  this  about  recorded  words.  One  could  certainly 
play  them  out  at  the  rate  of  100  to  250  words  per  minute.  I  think  that 
one  doesn’t  need  this  speed,  but  one  could  get  any  speed  one  wishes.  These 
speech  output  machines  are  very  expensive  and  the  cost  would  be  in  the 
hundreds of thousands of dollars. There is no way of knowing what these
machines would cost in quantity, nor is it likely that people would buy
them  in  quantity.  One  doesn’t  need  great  sophistication  in  such  devices, 
and  one  could  compromise  a  great  deal  with  the  number  of  words  that  the 
machine  would  have  to  store.  I  think  there  are  people  who  are  very 
competent  who  have  worked  on  this  problem  of  word  output  (on  such 
matters  as  when  one  starts  a  word  and  when  one  finishes  it,  and  so  on). 
I  do  not  think  speech  produced  by  machine  would  be  particularly  pleasant 
to  listen  to.  It  would  be  monotonous,  but  it  certainly  could  spell  out  words 
that  are  not  stored. 


STAGE  OF  DEVELOPMENT  OF  AUTOMATIC 
CHARACTER  RECOGNITION  AND  COMPLEX 
READING  MACHINES  FOR  THE  BLIND 
IN  EUROPE 

HELMUT  KAZMIERCZAK 

Technische Hochschule, Karlsruhe, West Germany


Despite  the  burgeoning  development  in  all  fields  of  techniques  and  despite 
the  efforts  of  many  scientists  and  engineers  within  the  last  twenty  years,  there 
has  been  no  notable  success  in  developing  and  constructing  a  reading  device 
for  the  blind — a  device  which  might  be  considered  a  real  aid  to  the  skilled 
and  educated  blind  person.  On  the  one  hand,  simplex  reading  devices 
have  been  developed  in  which  the  optical  contrast  of  the  printed  script  is 
converted  into  excitation  patterns  for  one  or  another  of  the  senses  left 
to  the  blind  (i.e.,  hearing  or  touch).  Here  the  human  brain  has  to  interpret 
the  stimulating  patterns  received;  thus  the  recognition  process  is  left  to  men. 

On  the  other  hand,  complex  reading  machines  for  data  processing 
have  been  developed  for  entering,  sorting,  and  registration  of  checks  in 
banks,  for  forms  in  insurance  companies,  and  for  balancing  and  handling 
credit  cards  and  tally  rolls.  Furthermore,  reading  machines  are  being  de¬ 
veloped  and  designed  for  applications  like  documentation  and  automatic 
language  translation.  For  these  applications  the  recognition  process  of 
characters  read  is  done  automatically  by  the  machine  and  reading  speeds 
are  achieved  which  surpass  human  capabilities  by  far. 

So  far  as  simplex  reading  devices  are  concerned,  it  is  my  opinion  that 
they  cannot  satisfy  the  requirements  which  must  be  set  for  a  reading  aid 
for the blind. Drawbacks of these aids include the limit in reading speed,
for physiological reasons, to about three characters per second with an
acoustical output; the great physical and mental strain; and the trouble of
learning new code patterns. Moreover, the capacity of the substitute sensory
channel is restricted because it must carry the reading signal
in addition to its normal function. I believe, therefore, that in the long run
only complex reading machines will meet all the requirements of a reading
aid  for  the  blind. 

This  does  not  curtail  the  significance  of  simplex  reading  devices  by 
any  means.  Just  because  they  can  be  realized  by  relatively  simple  tech¬ 
niques,  some  devices  have  been  demonstrated  with  which  valuable  findings 
could  be  made.  Besides,  these  devices  can  be  used  for  such  simple  tasks 
as  checking  of  the  value  of  paper  money,  light  intensities,  or  colors,  and 
for reading short typewritten letters. In what follows a statement will
be given of the extent to which a complex reading machine may be appropriate
as a reading device for the blind, considering the most up-to-date techniques.

                    Stage of development           Requirement
Features            (application: data             (application: aid
                    processing)                    for the blind)
------------------------------------------------------------------------
Documents           tally rolls, cards, checks,    letters, newspapers,
                    pages, microfilm               books

Characters          first generation machines:     more than one printed,
                    one printed symbolized         typewritten, and
                    numeric font;                  handscribed alphanumeric
                    second generation machines:    character set
                    one printed more conventional
                    alphanumeric symbol font

Reading speed       500 ... 3000 ch/s              10 ... 20 ch/s

Operation
  transportation    automatic                      manual/mechanical
  sorting           automatic                      none
  scanning          automatic                      manual/automatic
  recognition       automatic                      automatic

Output              code signals, punched card,    spelled speech, speech
                    punched tape, magnetic tape

Security
  reject rate       10 ... 0.01 %                  less important because
  error rate        very low                       of word redundancy

Expenditure
  first costs       ... 100,000 $                  low
  maintenance       high                           low

Weight              ... 2,500 lbs                  transportable

Figure 1 Comparison of Reading Machines for Different Application

Figure 1 gives a comparison of requirements which have to be met for
reading machines for data processing and for use by the blind. Nowadays
machines  for  data  processing  read  printed  characters  from  documents  like 
tally  rolls,  cards,  checks,  occasionally  from  pages  (i.e.,  single  sheets),  and 
from  microfilms.  Regarding  the  stage  of  development  reached  today,  read¬ 
ing  machines  for  data  processing  may  be  divided  into  two  classes.  First 
generation  machines  are  capable  of  reading  only  distinct  printed  highly 
stylized  numerals  and  a  few  symbols  with  high  reading  speed  and  great 
accuracy. Reading rates from 500 to 3000 characters per second are within
reach, limited only by the document transport mechanism. When a character
cannot be identified a reject is made by the machine. The reject rate
ranges from 10.0 to 0.01 percent of documents processed,
according to the working conditions; errors made by the machine are very
few: about one error per million correctly read characters.

Second  generation  machines  are  capable  of  reading  a  more  conven¬ 
tional  printed  alphanumeric  character  font  including  only  capital  letters. 
In  contrast  to  practice  in  the  U.S.A.,  in  Europe  these  machines  have  not 
been  constructed  up  to  now  because  of  lack  of  business  interest,  since 
European  businessmen  are  interested  mostly  in  first  generation  machines. 

Finally,  we  may  consider  third  generation  machines,  which  are  capa¬ 
ble  of  reading  more  than  one  printed  script  simultaneously.  Machine  costs, 
however,  increase  rapidly  towards  machines  of  third  generation  status, 
while  the  accuracy  of  character  recognition  declines  at  the  same  time.  For 
this  reason  one  tries  to  standardize  on  optical  character  fonts  for  data 
processing  purposes.  This  would  be  a  great  advantage  for  applications 
which  can  afford  standardization. 

In  contrast  to  reading  machines  for  data  processing  a  reading  machine 
for  the  blind  has  to  read  various,  differently  printed,  typewritten,  or  even 
handscribed  characters  from  letters,  newspapers,  or  books. 

Machine  operations  for  data  processing  applications  have  to  be  fully 
automatic  and  include  such  refinements  as  document  transport,  document 
sorting, character scanning, and character recognition; for a reading
machine for the blind, manual or mechanical transport of the
reading material (which, on the other hand, is more complex) and semiautomatic
character scanning are conceivable. No document sorting is required.
All this is justified because of the lower reading speed of about
10 to 20 characters per second, corresponding to the normal speed of conversation.

With  reading  machines  for  data  processing  the  output  of  automatically 
recognized  characters  is  carried  by  coded  electrical  signals,  or  on  punched 
cards,  punched  paper  tape,  or  magnetic  tape.  A  reading  machine  for  the 
blind,  on  the  contrary,  requires  a  more  complex  output  like  an  under¬ 
standable  speech  or  a  spelled  speech  output. 

The first costs and the maintenance of a reading machine for the blind
have  to  be  moderate  in  order  to  guarantee  that  a  large  circle  of  blind 
persons  can  make  use  of  the  reading  aid.  Reading  machines  for  data 
processing  cost  between  $5000  for  a  simple  code  print  interpreter  and 
$100,000 for a complex sorter-reader installation of first generation type.
In the latter case, however, the main costs are due to the complex sorter
mechanism. 

The  device  for  the  blind  should  be  portable.  The  accuracy  of  auto¬ 
matic  recognition  need  not  be  as  high  as  for  data  processing  application 
because  of  word  redundancy.  Even  if  one  or  two  letters  within  a  word  are 
misread,  its  meaning  remains  understandable  due  to  the  redundant  arrange¬ 
ment  of  letters.  The  question  is  whether  all  requirements  which  must  be 
met  by  a  reading  machine  for  the  blind  can  be  satisfied  without  excluding 
each  other.  The  best  way  of  getting  an  answer  is  to  make  reference  to  the 
problems  of  reading  machines  developed  today  and  compare  them  with 
those  of  a  reading  aid  for  the  blind. 
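The remark on word redundancy can be illustrated with a small sketch. The text leaves the recovery of a misread word to the listener; the code below merely shows, under assumptions, that one or two wrong letters usually still leave the intended word as the nearest entry in a word list. The five-word lexicon and the names `hamming` and `correct` are hypothetical.

```python
# Hypothetical word list; a real lexicon would be far larger.
LEXICON = ["reading", "machine", "blind", "letter", "printed"]


def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def correct(word, lexicon=LEXICON, max_errors=2):
    """Return the closest lexicon word of the same length, if close enough.

    Only substitutions are considered (a misread letter), not missing or
    extra letters. The word is returned unchanged when nothing is close.
    """
    candidates = [w for w in lexicon
                  if len(w) == len(word) and hamming(w, word) <= max_errors]
    if not candidates:
        return word
    return min(candidates, key=lambda w: hamming(w, word))


print(correct("rexding"))   # one misread letter
print(correct("mxchjne"))   # two misread letters
```

With a larger lexicon more words would compete for the same garbled input, so the tolerance of one or two errors per word is a property of this toy example, not a general guarantee.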

It  is  significant  for  all  first  generation  machines  that  the  numeric  sym¬ 
bol  font  read  by  them  is  developed  with  a  view  to  machine  recognition. 
This  gives  a  first  reference  point:  that  it  is  not  easy  to  read  normally  shaped, 
printed  script  by  machines  with  relatively  little  expenditure  and  high 
accuracy. 

[Figure 2: diagram relating machine legibility, human legibility, and printing to the character presentation. Presentations shown: code only (mechanical, dielectric, magnetic, optical); printed, normally shaped characters with an external magnetic or optical code; printed, symbolized characters with an internal evident code (magnetic, optical); printed, symbolized characters with an internal nonevident code (magnetic, optical); printed, normally shaped characters; typewritten, hand-printed, and hand-scribed characters.]

Figure 2 Legibility and Printing of Possible Character Presentation

Figure 2 demonstrates how machine legibility, human legibility, and ease of
printing depend on the character presentation. A character
presentation consisting merely of a mechanical, dielectric, magnetic, or
optical code (such as punched codes, magnetic codes, or code prints) is
easily read by machines although the legibility to humans is very poor and
printing  may  possibly  be  difficult.  A  possibility  for  compromising  between 
human  and  machine  legibility  is  to  print  normally  shaped  characters  in 
addition  to  code  prints;  that  is,  to  duplicate  characters  by  a  bar  code  or 
something  else.  The  other  way  is  to  use  the  character  segments  themselves 
for  representing  a  code.  This  implies  that  the  characters  have  to  be  stylized 
according  to  the  code  chosen.  A  sample  of  such  fonts  is  given  in  Figure  4. 
In  the  case  mentioned  first  the  code  is  external.  In  the  latter  case  two  ways 
of  internal  coding  are  possible.  First,  the  code  may  be  arranged  in  a  man¬ 
ner  that  it  is  evident;  second,  that  it  is  nonevident.  Furthermore,  two  pos¬ 
sible  ways  of  printing  these  character  fonts  are  feasible.  First,  to  print  the 
characters  only  in  optical  contrasts;  second,  to  print  them  both  optically 
and  magnetically.  In  the  latter  case  the  print  requirements  are  generally 
more  stringent  in  regard  to  ink  quality  and  printing  tolerances,  while  the 
machine  legibility  is  better  and  more  constant  in  comparison  with  optical 
scripts  where  avoidance  of  smudging  and  the  like  is  crucial.  Machine  legi¬ 
bility  decreases  from  characters  duplicated  with  a  code  print  to  highly 
stylized  characters  with  a  nonevident  code,  whereas  human  legibility  de¬ 
creases  in  the  opposite  direction.  Machine  legibility  is  still  worse  when  nor¬ 
mally  shaped  printed  and  typewritten  characters  are  concerned.  Finally, 
the  problem  of  reading  handprinted  or  cursive  script  by  machines  is  not 
yet  solved,  at  least  on  the  commercial  level. 

Figure  3  gives  a  small  sample  of  alphanumeric  character  fonts  which 
occur  in  European  newspapers  and  books.  These  are  the  printed  characters 
which  must  be  read  by  a  reading  machine  for  the  blind.  Note  the  immense 
number  of  variations  existing  for  these  scripts — variations  in  scale,  shape, 
structure,  breadth,  height,  line  width,  and  spacing  of  characters.  Despite 
the  many  variations  the  human  eye  and  brain  give  the  same  digital  output 
signal  for  corresponding  characters;  this  is  required  from  the  machine 
as  well.  Furthermore,  the  volume  of  the  described  character  fonts  consisting 
of  capital  and  small  letters,  numerals,  punctuation  marks,  and  symbols  is 
very  large  compared  with  the  character  font  capability  of  a  first  generation 
reading  machine  which  consists  only  of  numerals  and  a  few  symbols,  as 
shown  by  Figure  4. 

Figure  4  gives  a  survey  of  character  fonts  for  first  generation  machines 
available  in  part  on  the  European  market.  The  comparison  with  Figure  3 
makes  evident  the  high  capacity  of  the  human  eye  compared  with  that 
of  existing  reading  machines,  and  shows  at  the  same  time  the  severe  require¬ 
ments  which  must  be  met  by  a  reading  machine  for  the  blind. 


266  Man-Machine  Systems 

Seimag,  the  German  company,  has  developed  a  numerical  character 
font  with  an  external  magnetic  code.  A  magnetic  bar  code  is  arranged 
above  and  below  the  printed,  normally  shaped  numerals  and  symbols.  The 
magnetic  bar  code  is  read  by  a  conventional  magnetic  read  head.  The  bar 
code  elements  are  printed  in  five  positions  for  each  character.  The  upper 
and  lower  first  code  position  is  always  marked  and  serves  for  starting  the 

[Figure 3. Examples of Printed Alphanumeric Character Sets. Typefaces shown:
Times kursiv, Futura mager, Weiss Antiqua kursiv, Gill, Rockwell fett,
Fette Antiqua.]


recognition logic. The values 1, 2, 4, and 8 are assigned to the next four
upper code positions, representing the code words which are the numerals
0 to 9 and a few symbols. The lower positions are used for the complements.
Each weighted position is marked either above or below the character,
but not simultaneously in both bar rows, thereby making the code presentation,
and thus the character presentation, self-checking. Another advantage
is that the characters are self-starting and self-timing, since each
code position is marked by a bar above or below the character. Thus the
recognition does not depend on the document velocity. This character font
allows the presentation of 2^4, or 16, possible characters.
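
The complement scheme described above can be sketched in a few lines. This is a minimal illustration only, not the Siemag circuit: the order in which the weights 1, 2, 4, and 8 are laid out along the character, and the names `encode` and `decode`, are assumptions made for the example.

```python
# Sketch of a two-row, self-checking bar code (assumed bit order: 1, 2, 4, 8).
WEIGHTS = (1, 2, 4, 8)

def encode(value):
    """Return (upper, lower) bar rows, each a tuple of five booleans."""
    if not 0 <= value < 16:
        raise ValueError("value must fit in four weighted positions")
    upper = [True]   # first position: always marked, starts the logic
    lower = [True]
    for weight in WEIGHTS:
        bit = bool(value & weight)
        upper.append(bit)        # upper row carries the weighted bit
        lower.append(not bit)    # lower row carries its complement
    return tuple(upper), tuple(lower)

def decode(upper, lower):
    """Return the coded value, or None when the self-check fails."""
    if len(upper) != 5 or len(lower) != 5:
        return None
    if not (upper[0] and lower[0]):
        return None              # start marks missing
    value = 0
    for weight, up, low in zip(WEIGHTS, upper[1:], lower[1:]):
        if up == low:
            return None          # marked in both rows or in neither
        if up:
            value += weight
    return value
```

Because every weighted position carries exactly one bar, a missing or extra bar is detected rather than read as a different character.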



The  character  fonts  developed  by  Electrical  and  Musical  Industries 
(EMI)  of  England  and  by  Bull  of  France  have  an  internal  evident  code  for 
character  presentation  printed  with  magnetic  ink.  With  the  EMI  font  five 
horizontal  positions  have  to  be  checked  with  regard  to  the  vertical  sum  of 
magnetic  ink.  Two  steps  have  to  be  discriminated.  Expressed  in  terms  of 
printing  ink,  one  has  to  discriminate  between  mainly  black  or  mainly  white 


[Figure 4. Examples of Numeric Character Sets for First Generation Reading
Machines. Panels legible in the source: Siemag, EMI, Bull CMC-7, ABA E-13B.]


control  areas.  The  first  black  area  serves  for  starting  the  logic.  Four  control 
areas  remain  for  representing  16  possible  characters. 

The characters of the Bull-CMC-7 font are composed of seven discontinuous
strokes. The six stroke distances are the recognition criteria.
Only two different distances (long and short) are discriminated. Each
character composition possesses exactly two long stroke distances, thereby
making this code self-checking. Choosing 2 long distances out of 6 gives
15 combinations, so 15 characters can be presented.
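
The count of 15 follows from choosing which two of the six stroke distances are long. A short sketch, with hypothetical function names and the letters "L" and "S" standing for long and short distances, enumerates the patterns and applies the self-check:

```python
from itertools import combinations
from math import comb

def two_long_patterns(gaps=6, long_gaps=2):
    """List every pattern of stroke distances with exactly `long_gaps` long ones."""
    patterns = []
    for chosen in combinations(range(gaps), long_gaps):
        patterns.append(tuple("L" if i in chosen else "S" for i in range(gaps)))
    return patterns

def is_valid(pattern, long_gaps=2):
    """Self-check: a pattern is accepted only with the expected number of long gaps."""
    return pattern.count("L") == long_gaps

patterns = two_long_patterns()
print(len(patterns), comb(6, 2))
```

A single misread distance changes the number of long gaps to one or three, so the pattern is rejected instead of being mistaken for another character.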

The  most  significant  and  well-proved  magnetic  ink  character  set  is  the 
E-13B  font  developed  by  the  American  Bankers  Association.  The  code  is 
internal and nonevident. As with the EMI font, each character is composed
of a characteristic vertical sum of magnetic ink along the horizontal axis
which  is  the  reading  direction. 

The  CZ-13  font  has  been  developed  by  the  German  company  Standard 
Elektrik  Lorenz  of  ITT.  The  presented  character  set  is  an  optical  numeric 
symbol  font  with  an  internal  evident  code.  The  symbols  are  omitted  here. 
The vertical line segments of the character constitute a code. The character
area is dissected into five upper and five lower control areas, each of
which may be covered by a vertical line segment. Because of the habitual
shape of numerals, only a few possibilities remain in designing the ten
numerals. It should be mentioned in this connection that machine legibility
is governed by the smallest number of code elements (in this case
vertical line segments) of a character which, when altered, yield a new
character. This minimum number of code elements, or minimum area of
character lines, not common to any two characters of the set is called
the Minimum Hamming Distance, abbreviated MHD in code theory. The MHD
is actually a measure of the power of discrimination in the worst case,
when referred to the total number of code elements or to the total
character area. Here the MHD of the CZ-13 font is two, on a total of ten
code elements, namely ten vertical line segments.
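
The minimum Hamming distance of a character set can be computed by comparing every pair of code words. The sketch below assumes each character is written as a string of ten binary code elements; the three example words are invented for illustration and are not the CZ-13 patterns.

```python
from itertools import combinations

def hamming(a, b):
    """Number of code elements in which two equally long code words differ."""
    if len(a) != len(b):
        raise ValueError("code words must have the same length")
    return sum(x != y for x, y in zip(a, b))

def minimum_hamming_distance(code_words):
    """Smallest pairwise distance: the worst-case power of discrimination."""
    return min(hamming(a, b) for a, b in combinations(code_words, 2))

# Hypothetical ten-element code words (1 = control area covered by a segment).
example = ["1100000000", "0011000000", "1010000000"]
print(minimum_hamming_distance(example))
```

The closest pair of words determines the result, which is why a single poorly separated pair of characters limits the legibility of the whole set.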

Finally, Figure 4 demonstrates a preliminary, highly stylized optical
font designed on the basis of a construction grid of square elements,
five columns by nine rows. This font design is suitable for both first
and second generation machines using a stroke analysis or a matrix
matching recognition scheme. The code is internal and non-evident. The
criterion for machine recognition is the arrangement of vertical and
horizontal line segments of characters.

Figure 5 gives a listing of reading machines of the first and second
generation type which are known in Europe and are in part commercially
available. For bank accounting, and especially in Germany for the postal
check service, magnetic ink sorter-reader equipment is distributed by
companies like Bull, IBM, Burroughs, National Cash Register/Pitney
Bowes, National Data Processing, and General Electric/Telefunken,
whereas optical equipment is distributed by IBM and Standard Elektrik
Lorenz of ITT. All these machines are of first generation type and intended
for automatic accounting together with a computer and a storage medium
for account holding. Only numeric symbol fonts are used.

On  the  other  hand,  smaller  banks  may  use  readers  without  sorters  at 
moderate  cost  for  bookkeeping  with  cards,  punched  cards,  and  the  like. 


These  machines  are  distributed  by  German  companies — by  Siemag  for 
magnetic  ink-printed  characters  and  by  Ruf/Hasler  and  Zeiss  Ikon  for 
optical  characters.  Tally  roll  readers  have  been  developed  by  Solartron/ 
England  and  by  Sweda.  In  addition,  Standard  Elektrik  Lorenz  developed  a 
reader  for  Pica  numerals  of  a  typewriter.  Siemens  and  our  Institute  at  the 
Karlsruhe  Technical  University  have  developed  character  readers  for  the 


[Figure 5. Survey of Reading Machines for Data Processing. Table columns:
machine-read contrast (magnetic, optical); code (external, internal evident,
internal non-evident); font type; company; model; recognition scheme.]


simultaneous  reading  of  typewritten  numerals  of  various  fonts  which  may 
be  applied,  for  example,  to  read  the  four  digit  numeric  address  code  on 
letters  in  Germany.  Known  through  the  literature  are  the  code  reader  of 
Addressograph-Multigraph;  reading  machines  for  alphanumeric  symbol 
fonts  by  Farrington,  which  distributes  both  a  card  and  a  page  reader;  IBM; 
Rabinow;  Baird-Atomic;  and  the  announced  readers  of  RCA,  Philco, 
Remington Rand and other companies. Some of the latter machines must
be ranked with the second generation machines; most often the character
presentations are highly stylized. The recognition schemes range from, in
the simplest case, mere code interpretation or linear correlation to stroke
analysis, matrix correlation, mask matching, or shape analysis; this latter
requires a statement of shape elements together with direction and position.
Moreover, English scientists are engaged in special optical correlation methods
and shape analysis; shape analysis is studied at Bull in France; Russian
scientists are investigating topological recognition methods which require
a statement of character line crossings and terminal lines; and Italian
scientists are engaged in perceptron-like types of recognition.

After  this  summary  of  known  reading  machines  and  methods  on  the 
one hand, and of possible character presentation on the other, the possibilities
of realizing a reading machine for the blind can be discussed. The
statements  below  are  based  on  the  assumption  of  a  matrix  correlation 
method  especially  apt  for  estimating  the  expenditure. 

After a character is scanned it is stored as a whole in a matrix-like
reproduction storage. A character is represented by electrical means,
quantized in amplitude to white and black information and quantized in
area to square matrix elements; this means the character is presented in
a raster. A matrix correlation method is very adaptable to the formation
of recognition logic when character fonts of large volume and varying
fonts have to be recognized. Furthermore, each electrically reproduced
character may be shifted into a best matching position with the wired
ideal character specimen before the process of recognition begins. This is
necessary because of possible misregistration of characters. The height of
a character row is not fixed, since tolerances of printing and allowances
for paper transport occur. Concerning the recognition, the mask matching
need not be uniformly weighted over the whole matching area. Definite
matrix elements which are covered with character segments of high
information capacity may be weighted higher in the mask matching process.
Finally, recognition schemes such as shape analysis, which is especially
suitable for a reading machine for the blind, generally require an
intermediate reproduction storage as well. Shape analysis schemes claim to
be more economical in logic than matrix correlation when many character
fonts are considered, because they detect similar and slightly varying shapes.
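
Weighted mask matching with a search over small shifts may be sketched as follows. This is an illustration under stated assumptions, not the wired logic of any machine named here: rasters are small lists of 0 and 1, the score is the weighted fraction of agreeing elements, and all function names are hypothetical.

```python
def shift(image, dy, dx):
    """Move a raster by dy rows and dx columns, filling vacated cells with white (0)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                out[rr][cc] = image[r][c]
    return out

def weighted_score(image, mask, weights):
    """Weighted fraction of matrix elements on which image and ideal mask agree."""
    total = sum(sum(row) for row in weights)
    agree = 0
    for img_row, mask_row, w_row in zip(image, mask, weights):
        for pixel, ideal, w in zip(img_row, mask_row, w_row):
            if pixel == ideal:
                agree += w
    return agree / total

def recognize(image, masks, weights, max_shift=1):
    """Try every mask at every small shift; return (best name, best score)."""
    best_name, best_score = None, -1.0
    for name, mask in masks.items():
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                score = weighted_score(shift(image, dy, dx), mask, weights[name])
                if score > best_score:
                    best_name, best_score = name, score
    return best_name, best_score
```

Raising the weight of elements that separate similar characters corresponds to the higher weighting of segments of high information capacity mentioned in the text.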

Optical correlation methods are considered to be of less importance.
Exposure and registration by optical/mechanical means are unrealistic in
view of the large number of different character masks, whereas an optical
method with partial correlation is more sensitive to misregistration;
integrating optical methods yield a very poor discrimination among characters
when the volume is very large.

Before feeding the character features to the recognition logic a character
has to be reproduced electrically by an intermediate storage. The
number of storage elements, which is of course identical with the number
of matrix or raster elements, has a crucial influence on the machine cost;
hence the resolution used in scanning a character must not be
greater than necessary. Figure 6 demonstrates the dependence of contrast
on the diameter D of a scanning spot when crossing a character line of
line width d at right angles. On the left-hand side the contrast is shown when
assuming a square diaphragm; on the right-hand side the contrast is shown

[Figure 6. Contrast of Character Line Scanning as a Function of Resolution]

when assuming a circular diaphragm or a scanning light spot. The contrast
in scanning is drawn as a function of spot location in relation to the center
of the character line. The contrast is given by the ratio of the scanned black
portion of a character line to the whole area of the scanning spot, which
depends on the ratio of spot diameter D to line width d. In order to achieve
an optimum contrast the spot diameter D should be smaller than the character
line width d, as is evident when comparing the curves for ratios D/d
equal to 1 and 3. The discrimination between white paper background
and black character line is accomplished by a threshold circuit
adjusted to some particular threshold value (of 0.5, for example). Assuming
a constant noise caused by printing and by photoelectrical transducing,
the slopes of the contrast curves are important in the range of the
threshold level because of the necessity for correct triggering of the
discriminator circuit for black information. Here one must compromise between
steep slope and adequate resolution. A compromise is given for a
slope angle of 1 to 1/2 the line width d. This yields ratios of spot diameter
to line width of 0.5 and 0.64 for square versus circular scanning spots.
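
For the square diaphragm the contrast curve can be written down directly, since the covered fraction of the spot equals the overlap of two intervals. The sketch below is a geometric idealization (perfectly black line, perfectly white paper, no noise); the function name is hypothetical and the circular spot is not treated.

```python
def contrast_square(x, D, d):
    """Fraction of a square spot of side D covered by a line of width d.

    x is the position of the spot center relative to the line center,
    in the same length units as D and d.
    """
    left = max(x - D / 2, -d / 2)     # left edge of the overlap
    right = min(x + D / 2, d / 2)     # right edge of the overlap
    return max(0.0, right - left) / D

# Peak contrast at the line center for D/d = 1 and D/d = 3.
print(contrast_square(0.0, 1.0, 1.0), contrast_square(0.0, 3.0, 1.0))
```

With D/d = 3 the contrast never exceeds one third, so a threshold of 0.5 would not be reached; this is the reason for keeping the spot no larger than the line width.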

Figure  7  gives  a  Pica  typewriter  character  font  of  relatively  thin  line 
width  which  is,  therefore,  apt  for  estimating  the  required  resolution  for  a 
reading  machine  for  the  blind.  The  character  spacing  is  10  characters  to 

Pica Type Font

ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
1234567890
(punctuation marks and symbols)

spacing:       10 characters per 26 mm
height:        2.6 mm
line width d:  nearly 1/15 character height = 0.17 mm

resolution:
spot diameter D = d:       15 x 23 = 345
spot diameter D = 0.64 d:  23 x 35 = 805

Figure 7 Example of an Alphanumeric Typewritten Pica Character Font
and Required Resolution


each 26 mm and the height is 2.6 mm. The character line width is nearly
1/15 of the height (nearly 0.17 mm). The area which must be scanned per
character has to be larger in height than the character height because of the
possibility of misregistration. The minimum is a zone of half a character
height for misregistration in addition to the character height itself. This
requires a rectangular zone of 2.6 mm by 3.9 mm to be scanned per character.
Assuming a resolution (according to Figure 6, expressed as the ratio of spot
diameter to line width) of 1.0 or 0.64, nearly 350 to 800 matrix elements
are required.
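
The element counts of Figure 7 can be reproduced from the dimensions just given. The sketch assumes that one raster element is as wide as the scanning spot and that fractional counts are rounded half up; both are assumptions chosen because they reproduce the published figures, not statements from the text.

```python
from fractions import Fraction
from math import floor

# Scanned zone in units of the line width d (d = 2.6 mm / 15).
ZONE_WIDTH = Fraction(15)        # 2.6 mm
ZONE_HEIGHT = Fraction(45, 2)    # 3.9 mm, i.e. 1.5 character heights

def round_half_up(x):
    """Round an exact fraction to the nearest integer, halves upward."""
    return floor(x + Fraction(1, 2))

def raster(ratio):
    """Return (columns, rows, elements) for a spot diameter of `ratio` line widths."""
    ratio = Fraction(ratio)
    cols = round_half_up(ZONE_WIDTH / ratio)
    rows = round_half_up(ZONE_HEIGHT / ratio)
    return cols, rows, cols * rows

print(raster("1"))
print(raster("0.64"))
```

Exact fractions are used so that the borderline value of 22.5 rows is not disturbed by floating-point rounding.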

Figure 8 gives a survey of different ways of scanning a character. The
document which is imprinted with characters is illuminated and the characters
are imaged by a lens system onto one phototransducer or an array of
phototransducers. The character image is shifted across the phototransducer
in a horizontal direction by the mechanical document transport. The vertical
scanning of matrix columns is accomplished either by a rotating disk slit
system or electronically by a deflected electron beam from a TV camera tube
(a vidicon, for example), a dissector tube, or by a linear arrangement of
phototransducers. Thirty transducers are necessary, for example, for the
required resolution. Finally, for a flying spot scanner, the light spot on
the screen of a


features                      linear array of    flying   dissector  scanning  TV camera
                              phototransducers   spot     tube       disk      tube

cost of components                  10             ?         6          5         7
  (data processing)
high speed                           +             +         +          +         -
suitable light spectrum              +             -         -          +         -
no mechanically moved parts          +             +         +          -         +
tolerances, aging                    -             +         +          +         +
no darkening                         +             -         +          +         +
solid state components               +             -         -          -         -
low voltage                          +             -         -          -         +
cost of components                   7             4         6          ?         6
  (aid for the blind)

(? = entry illegible in the source)

Figure 8 Comparison of Scanner Types in Regard to Data Processing and
to an Aid for the Blind

cathode ray tube is deflected by electrical means and directed onto the
darkened document by a lens system. The requirements which are important
for data processing are arranged in the upper part of Figure 8, and those
important for a reading machine for the blind are arranged in the lower
part. High speed scanning is accomplished by an array of photodiodes. The
vidicon yields the lowest scanning rates, about 50 pictures per second, owing
to the charge storage effect. Flying spot, dissector, and vidicon tubes are
somewhat restricted in application, in comparison to solid state
phototransducers, because of the fixed spectrum of the blue screens needed
for short phosphorescence decay times and of the photocathodes. The
disadvantage of a scanning disk is the mechanical drive required. The
scattering of data and the aging of solid state components of the photodiode
array interfere with both applications. Owing to the required smallness of
the phototransducers within an array, and also to the frequency limit, only
photodiodes can be considered for data processing and only photoresistors
for reading aids. A disadvantage of the flying spot scanner is the need to
darken the documents. Of the scanners compared, only the phototransducer
array can be realized entirely with solid state components. This implies low
voltages, in contrast to scanners requiring cathode ray tubes or
photomultiplier tubes, which require voltages from 800 to 25,000 volts.
Regarding only components, the net costs of a photodiode scanning array for
data processing are the highest, whereas the scanning disk for a reading
machine for the blind can be realized at modest cost. It is my opinion that
for this application a scanning disk working with high accuracy is most
reasonable. Because of the relatively long scanning time per character
(about 1 second for 10 or 20 characters), 200 microseconds for each matrix
element is within reach; a photoresistor may be used for transducing. The
only disadvantage is perhaps the mechanical drive of the scanning disk.
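
The dwell time per matrix element follows from the reading rate and the raster size. In the sketch below the value of 500 elements per character is taken from the estimate given later in this paper; the function name is hypothetical.

```python
def microseconds_per_element(chars_per_second, elements_per_char):
    """Time available for scanning one matrix element, in microseconds."""
    return 1_000_000 / (chars_per_second * elements_per_char)

# 10 or 20 characters per second at about 500 matrix elements per character.
print(microseconds_per_element(10, 500))
print(microseconds_per_element(20, 500))
```

At 10 characters per second this gives the 200 microseconds quoted in the text; at 20 characters per second the time is halved.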

A two dimensional array of phototransducers is not included in Figure
8. A phototransducer array in addition to an intermediate storage is not
justified because of the higher expenditure. On the other hand, when the
logic is coupled directly to the transducers without intermediate storage,
the correction of misregistration in height and the centering of characters
cannot be accomplished without additional means. Furthermore, the
disadvantageous influence of component scatter and aging, as well as the
costs, increase quadratically as the number of phototransducers grows.

The reproduction storage generally consists of transistor flip-flops working
as a shift register with two shift directions.

Figure  9,  for  example,  demonstrates  one  way  of  reproducing  a  character. 


Number of code words for n code elements:

   n     MHD = 3    MHD = 4    MHD = 22
   3        2          -
   4        2          2
   5        4          2
   6        8          4
   7       16          8
   8       16         16
   9       32         16
  10       64         32
  11      128         64
  12      256        128
  44                              88

character font (small letters, capital letters, numerals, punctuation
marks, symbols): more than 80 characters
number of raster elements: N about 350 ... 800
number of checked elements: n about 100

Figure 9 Discrimination Between Characters of an Alphanumeric Font

Each matrix element of the resolved area corresponds to a flip-flop store.
As is well known, the store consists of two transistors fed back in such a way
that only one transistor is conductive while the other is forced to be
non-conductive. This yields two different stable states: the information “black”
or “white” may be linked to one or the other stable state at will. For clarity
only the marked stores are represented in Figure 9. Now the stored information
or character may be shifted as a whole to the right-hand side and
downwards, for example, so that the electrically reproduced character may
be matched with the recognition logic, even when misregistered with respect
to matrix columns and rows. This process is called centering. According
to the required resolution, here about 500 matrix elements, the same
number of flip-flop stores must be provided. A data reduction between scanner
and intermediate store, in order to reduce the expenditure for storage,
is hardly possible because the resolution already has been kept low compared
with the large volume of character fonts.
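
Centering by shifting can be sketched with a raster held as a list of rows. The reference position used here, with the character pushed against the right-hand and bottom edges of the store, is an assumption made for the example; the text states only that the character is shifted to the right and downwards until it matches the logic.

```python
def shift_right(raster):
    """Shift every row one column to the right, as a shift register would."""
    return [[0] + row[:-1] for row in raster]

def shift_down(raster):
    """Shift the whole raster one row downwards."""
    return [[0] * len(raster[0])] + [row[:] for row in raster[:-1]]

def center(raster):
    """Shift until black information reaches the right and bottom edges."""
    if not any(any(row) for row in raster):
        return raster                       # blank field: nothing to center
    while not any(row[-1] for row in raster):
        raster = shift_right(raster)
    while not any(raster[-1]):
        raster = shift_down(raster)
    return raster
```

After centering, the same fixed control areas can be used for every character regardless of where it fell within the scanned zone.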

The  recognition  logic  and  its  circuits  have  to  be  fixed  in  definite  areas 
within  the  reproduction  storage.  They  have  to  indicate  black  information 
(character  segments  which  cover  certain  areas).  To  provide  the  greatest 
efficiency  these  control  areas  have  to  be  chosen  in  such  a  manner  that  as 
few  characters  as  possible  have  the  same  control  areas. 

How  many  control  areas  are  necessary  to  distinguish  among  all  the 
characters  of  an  alphanumeric  character  font  of  more  than  80  characters, 
consisting  of  small  and  capital  letters,  numerals,  punctuation  marks,  and 
symbols?  One  may  interpret  the  control  areas  within  the  character  field 
as  code  elements,  and  the  quantized  character  as  a  code  word  of  a  binary 
code,  where  0  means  no  covering  and  1  means  covering  of  a  control  area, 
for  example.  The  question  is,  how  many  code  elements  are  necessary  to 
present  80  code  words  when  each  code  word  differs  from  every  other  code 
word  by  at  least  a  given  number  of  code  elements?  Comparing  the  numeral 
8  with  the  capital  letter  B  in  Figure  9  shows  that  the  assumption  of  an 
MHD  value  of  4,  i.e.,  four  control  areas,  which  yield  a  different  statement 
when  comparing  the  numeral  8  with  capital  B,  is  justified  and  necessary 
for  an  unobjectionable  discrimination.  The  table  in  Figure  9  gives  the 
results  of  code  theory.  For  the  discrimination  of  80  to  128  characters, 
12  control  areas  are  necessary  when  an  MHD  value  of  4  is  assumed. 
In practice, more than 12 control areas are required since the shape of
characters has grown naturally over many centuries and has not been designed
with regard to code theory. Characters do not allow a maximum
discrimination and so they cannot be pressed into 12 control areas. The
other extreme is given by four control areas for each character, where no
control area of one character can be used for the other. In this case 4 times
80, or 320, control areas would be necessary. Certainly the practical case
is closer to 12 control areas. The code theory gives an MHD value of 22
for a maximum of 88 possible code words, represented by 44 code elements.
So one should assume about 30 control areas for one alphanumeric
font. Since the matching must not lead to misrecognition due to
misregistration caused by quantization errors and noise, each control area
must cover more than one matrix element. The minimum number of
matrix elements is about three. The matrix elements of such an area are
connected by “or”-gates in the simplest case. This means that a control area
which consists of three matrix elements signals a covering by a character
segment when at least one matrix element is covered. So the number of
matrix elements to which the recognition logic must be fixed is in the region
of at least 100 for one alphanumeric font.
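
The "or"-gate connection of matrix elements into control areas can be sketched as below. The raster and the cell coordinates are invented for the illustration; only the rule that an area signals when at least one of its elements is covered comes from the text.

```python
def control_area_fires(raster, cells):
    """OR-gate: the area signals when at least one of its elements is black."""
    return any(raster[r][c] for r, c in cells)

def code_word(raster, control_areas):
    """Binary code word: 1 for each covered control area, 0 otherwise."""
    return tuple(int(control_area_fires(raster, cells)) for cells in control_areas)

def elements_needed(areas, elements_per_area):
    """Matrix elements to which the recognition logic must be connected."""
    return areas * elements_per_area

print(elements_needed(30, 3))
```

Thirty areas of three elements each require 90 connections, which agrees with the order of 100 matrix elements estimated above.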

The feature statement of matching achieved by means of the recognition
logic has to be decoded to give the character read. This may be done
by a code interpreter consisting mainly of resistors, and having thresholds
for each representative character.
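
A software analogue of such a thresholded interpreter is sketched below. It is an assumption-laden illustration: the agreement count stands in for the summing resistor network, and the reference code words for "8" and "B" are invented, chosen only to differ in four elements.

```python
def interpret(code_word, references, threshold):
    """Return the best matching character, or None if no score reaches the threshold."""
    best_name, best_agreement = None, -1
    for name, reference in references.items():
        agreement = sum(a == b for a, b in zip(code_word, reference))
        if agreement > best_agreement:
            best_name, best_agreement = name, agreement
    return best_name if best_agreement >= threshold else None

# Hypothetical reference code words with a mutual distance of 4.
REFERENCES = {
    "8": (1, 1, 1, 1, 0, 0, 0, 0),
    "B": (1, 1, 0, 0, 1, 1, 0, 0),
}
print(interpret((1, 1, 1, 1, 0, 0, 0, 1), REFERENCES, threshold=7))
```

With a distance of four between reference words, a single wrong control area is still decoded, while a word lying midway between two references falls below the threshold and is rejected.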

Finally, Figure 10 gives a summary of the functional parts described for
a reading machine for the blind.

A statement is given also about the estimated costs of components of
each functional part. All costs are lowest amounts. The real costs are
certainly higher since no costs of development and production are considered.


reading machine for the blind                              costs (arbitrary units)

mechanical transport      manual/mechanical                          10
character scanner         scanning disk, spot diameter               10
                          1 to 0.6 line width
intermediate storage      transistor shift register,                100
                          nearly 500 bits
logical circuits          resistor logic connected to nearly         30
                          100 matrix elements of storage
spelled speech output     mechanical/acoustical                      50
total                                                             > 200

average monthly income    in USA                                     10
of an employee            in Germany                                  3

Figure 10 Estimated Costs of Components of a Reading Machine for
the Blind


The highest costs are due to the intermediate storage consisting of about
500 flip-flop stores for character reproduction. These costs may be expressed
as 100 arbitrary units. They are followed by 30 units which must
be assumed for the components of the logic, and by 10 units which must be
counted for the scanning device based on a revolving disk and slit system.
The mechanical transport for the reading material can possibly be realized
for 10 units. One has to assume manual mechanical operation of this device:
running through a character row, variable adjustment of row to row spacing,
adaptation to the reading material, adjustment of the scale of the optical
system when varying character heights occur, and the like.

Regarding an output of spelled speech, the costs might be comparable
to those for an electrical typewriter provided the output is realized by
mechanical means. The costs are estimated at 50 units. It is worth noting
that fewer characters have to be distinguished for acoustical output than
for character recognition. The discrimination between capital and small
letters is irrelevant for the speech, in contrast to character recognition
where the same capital and small letter are each represented by a different
pattern (it is possible to indicate capital letters by a greater sound
volume in the acoustical output).

According to this assessment, the costs of a reading machine for the
blind total more than 200 arbitrary units; expressed in dollars, more
than $5000. Even so, no costs of development and production are considered.
Compared with the average monthly income of a worker, the costs of components
are more than 70 times as high as the income of a German citizen,
or 20 times that of a U. S. citizen.
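
The totals can be checked from the unit values of Figure 10. The conversion of 25 dollars per unit is inferred here from the stated lower bounds (200 units, $5000) and is an assumption, not a figure given in the paper.

```python
# Component costs in arbitrary units, as listed in Figure 10.
COSTS = {
    "mechanical transport": 10,
    "character scanner": 10,
    "intermediate storage": 100,
    "logical circuits": 30,
    "spelled speech output": 50,
}
# Average monthly income of an employee, in the same units.
MONTHLY_INCOME = {"USA": 10, "Germany": 3}

total_units = sum(COSTS.values())
dollars_per_unit = 5000 / total_units          # inferred from the lower bounds
income_multiples = {
    country: total_units / income for country, income in MONTHLY_INCOME.items()
}
print(total_units, dollars_per_unit, income_multiples)
```

The rounded unit values give 20 monthly incomes for the United States and about 67 for Germany; since every entry is a lower bound, the latter is of the same order as the multiple stated in the text.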

In this estimate of costs only one character font was assumed to be
recognized automatically. If more than one font (say, for example, five or
six) is to be read, the estimated costs of the logic will double. One has to
admit that today the costs of a complete reading machine for the blind are
still too high, even if government subsidies are granted, for it to become
a real aid for the blind. Portability cannot be realized owing to
the large number of components.

It has been postulated above that only a reading machine can satisfy
the requirements which have to be set for a reading aid for the blind in
the long run. There remains the question of whether reading machines will
gain importance in the future. In my opinion the answer is yes, even
though the problem of automatic character recognition is not yet solved
(this is obvious considering the numerous reading machines of first
generation type, or comparing the efficiency of the human eye and brain with
reading machines of the second generation type). In the future costs may
be lowered by components becoming cheaper or by new ideas of perceptron-like
recognition and artificial intelligence. The transport problem of the
device could be solved by progress in micromodule techniques which imply
economy in volume and weight.

In conclusion, I should point out that any statement on reading machines
for the blind should be particularly objective and critical towards
the  problems  involved,  even  though  such  a  report  might  seem  somewhat 
pessimistic.  Otherwise,  hopes  spring  into  being  which  cannot  be  fulfilled — 
as  one  may  learn  from  the  many  letters  written  by  blind  persons  who  read 
items  in  the  popular  press  exaggerating  the  capabilities  of  present-day 
engineering  accomplishments  in  reading  machines,  especially  in  view  of 
the  cost  and  production  problems  involved. 


VISUAL  PATTERN  RECOGNITION: 


THE  PROBLEMS  AND  PROMISE 

OLIVER  G.  SELFRIDGE 

Lincoln  Laboratory *  of  Massachusetts  Institute  of  Technology,  Lexington, 
Massachusetts 


I shall make remarks here of a fairly general nature. I think on the whole I
feel very optimistic, in the very long run, about the abilities of our
technology to aid the blind in pattern recognition; and, agreeing with Herr
Kazmierczak, not very optimistic about the immediate prospects.

The  astonishing  uniqueness  of  man  in  general  I  think  depends  to  an 
enormous  extent  on  his  perception  and  integration  through  his  assorted 
senses  of  the  world  around  him.  I  can  see  that  it  is  the  job  of  scientists, 
among  others,  to  find  out  how  they  work  and  the  job  of  technology  to 
improve  them  prudently  and  appropriately.  The  power  of  our  eyes,  of 
course,  is  not  in  responding  to  light,  but  in  seeing  and  recognizing  patterns. 
The blind man does not necessarily need eyes alone; there is a horseshoe
crab common on our East Coast shores, called Limulus, whose eyes
seem to work very well neurologically, except that nobody has ever been
able to find out whether they influence its behavior at all. It is not mere
photosensitivity,  either;  as  we,  this  time  of  year,  dutifully  trudge  to  the 
beaches  where  we  find  horseshoe  crabs,  we  realize  anew  that  we  are  all 
covered  with  a  splendidly  photosensitive  layer.  The  time  constant  is  a  little 
long  for  some  purposes. 

Nor  is  it  sufficient  to  change  the  light  signals,  scanned  in  some  way, 
merely  to  something  else  like  sound.  Eyes  and  ears  have  great  powers,  but 
they  are  very  different  in  the  ways  they  work,  and  in  the  kind  of  patterns 
they  respond  to.  It  is  a  chief  thesis  here  that  perhaps  the  people  working  on 
reading  machines,  myself  included  of  course,  have  been  insufficiently 
sensitive  to  the  nature  of  the  patterns  which  have  to  be  detected  and  the 
nature  of  the  patterns  to  which  the  other  senses  are  most  sensitive.  What 
the  blind  need  is  a  sensible  level  of  decisions  about  the  light  patterns  that 
the  sighted  see. 

*  Operated  with  support  from  the  U.  S.  Army,  Navy,  and  Air  Force. 



This  is  not  the  same  as  saying  that  they  need  merely  photosensitivity. 
The  decisions  about  the  visual  patterns  ought  to  provide  a  balance  of  what 
is  most  useful  and  what  is  most  feasible.  Pattern  recognition  in  general,  of 
course, is not only concerned with visual patterns, but also with the
abstractions we make from the data derived from all our senses. What matters
to  us  in  hearing  speech  is  the  patterns  of  the  words  rather  than  the  individual 
sounds.  In  general  we  prefer  to  go  higher  rather  than  lower  in  levels  of 
decisions  when  we  can.  Sometimes  it  is  easy  to  specify  an  appropriate  set  of 
patterns  which  we  are  going  to  consider — a  vocabulary,  as  it  were,  or  an 
alphabet — or  at  least  to  narrow  the  choice.  For  reading  machines,  we  may 
guess that we ought to present decisions about the letters or, more
probably, about the words. For other kinds of problems, as in navigation or
guidance  for  the  blind,  there  has  been  much  too  little  exploration  of  the 
kinds  of  decisions  about  the  environment  which  the  blind  man  really  needs, 
the  kinds  of  patterns  which  he  really  should  respond  to.  For  that  matter, 
we  don’t  know  what  sighted  people  really  respond  to  in  looking  at  the 
environment.  It  clearly  is  not  particular  bits,  that  is,  particular  elements  in 
particular  places  in  his  field  of  view.  I  shall  come  back  to  this  later. 

One of the things that makes recognition of braille very easy for a
computer (and I suspect far from ideal for people) is that it is not a case
of pattern recognition at all, insofar as I am using the term. There are
just 6 bits, which correspond to 2^6, or 64, choices. Each bit is a very
distinct bit (I am using “bit” in the technical sense, that is, a single
binary choice). But the usual case of visual pattern recognition involves
thousands or millions of bits. It is not just a case then of detection of a
few fixed elements, but the recognition of complex variable patterns of
thousands or millions of elements. In many such patterns, some elements can
be altered without changing the pattern, while to change one of the
elements in a braille character is to change the character absolutely and
unequivocally. It is only natural for us to be concerned chiefly with
reading machines as the most direct and most obvious application of
technology; this is certainly the one we have spent the most time
discussing in these papers. For one thing, the context of a useful reading
machine for the blind, while it cannot be considered well-specified yet, is
at least substantially more restricted than that of other more general
kinds of visual aids.
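
The braille arithmetic in this paragraph can be checked with a short sketch. It assumes a hypothetical encoding in which each of the six dots of a cell is one bit of an integer; the names cell, flip, and LETTERS are illustrative only and do not come from the talk. The sketch shows that six binary choices give 64 cells and that toggling any single dot always yields a different cell.

```python
# Assumed encoding (not from the talk): dot k of a braille cell (k = 1..6)
# is stored as bit k-1 of an integer.
def cell(*dots):
    """Return the 6-bit integer for a cell with the given raised dots."""
    value = 0
    for d in dots:
        value |= 1 << (d - 1)
    return value

# Three standard letter cells, enough for the illustration.
LETTERS = {cell(1): "a", cell(1, 2): "b", cell(1, 4): "c"}

# Six independent binary choices.
NUM_CELLS = 2 ** 6

def flip(c, dot):
    """Toggle one dot; the result is necessarily a different cell."""
    return c ^ (1 << (dot - 1))

a = cell(1)
print(NUM_CELLS)
print(LETTERS[a], "becomes", LETTERS[flip(a, 2)], "when dot 2 is raised")
```

Nothing comparable holds for a printed letter scanned into thousands of elements, where many single elements can change without changing the letter.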

I  have  said  that  the  decisions  ought  to  represent  a  balance  of  what  is 
useful  and  what  is  feasible.  Now,  many  things  are  not  feasible  at  all  today; 
feasibility does impose a limit. It would be pleasant and convenient could
our reading machines read handwriting, for example, but they cannot and
they won’t be able to for some time. Even with print we have had all too
little  experience  to  say  with  assurance  what  is  the  most  useful  level  of 
display  of  reading  machines.  It  is  possible  (and  by  “possible”  I  mean 
without  any  great  research  and  development  program)  to  build  today  a 
machine  that  will  read  text  in  some  given  font  of  type  and  transform  it  into 
spoken  words.  That  is,  there  is  an  absolute  feasibility  to  do  this,  and  I 
daresay  that  experimentally  some  part  of  this  has  already  been  done  in 
fact,  that  some  written  text  can  be  translated  into  spoken  words  albeit 
without  any  intonation  or  inflection.  The  machine  will  contain  in  its 
memory  different  recorded  spoken  sounds  of  a  thousand  or  several  thousand 
different  words  and  would  merely  string  them  together  and  ship  them  out 
through  a  loudspeaker. 
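
A minimal sketch of the word-stringing machine just described may help fix ideas. Everything named here is hypothetical: the three-word RECORDINGS table stands in for the stored vocabulary of a thousand or more recorded words, the clip identifiers are invented, and the 300-word page used for the timing estimate is an assumption chosen only to make the arithmetic concrete.

```python
# Hypothetical stored vocabulary: recognized word -> identifier of a recording.
RECORDINGS = {"THE": "clip_001", "COW": "clip_002", "JUMPED": "clip_003"}

def string_together(recognized_words, recordings):
    """Return the clips to play in order, plus any words with no recording."""
    playlist, missing = [], []
    for word in recognized_words:
        clip = recordings.get(word.upper())
        if clip is None:
            missing.append(word)   # outside the stored vocabulary
        else:
            playlist.append(clip)
    return playlist, missing

def minutes_to_read(word_count, words_per_minute):
    """Listening time for a text at a fixed speaking rate."""
    return word_count / words_per_minute

playlist, missing = string_together(["The", "cow", "jumped", "over"], RECORDINGS)
print(playlist, missing)
# An assumed 300-word page at slow and at moderate speaking rates.
print(minutes_to_read(300, 120), minutes_to_read(300, 150))
```

The sketch makes the limitation visible: any word outside the stored vocabulary simply cannot be spoken, and the clips are played back to back with no intonation.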

We  don’t  know,  of  course,  nor  have  we  any  idea  in  fact  of  how  useful 
this  would  be.  We  can  say  some  things  about  it.  For  one  thing,  the  reading 
speed  would  be  on  the  order  of  slowly  spoken  speech;  I  would  suppose 
120 to 150 words a minute. The words would not flow into one another as in
natural speech; however, I am convinced it would not be an impossible job
for Haskins Laboratories, for example, to provide within a few years some
kind of continuity of speech: to run a vocoder, as it were, from the
decisions about particular words that have been seen by the character
reading machine.

I  am  not  necessarily  advocating  that  approach,  but  it  is  the  kind  of  thing 
we  can  do  today  at  a  level  of  absolute  feasibility.  It  ought  to  be  made  very 
clear  that  this  technique  would  involve  a  large  digital  computer.  I  think 
Herr  Kazmierczak’s  estimates  of  cost  are  excessively  low  for  this:  $100,000 
won’t  find  much  of  a  digital  computer  in  this  country — and  I  think  you 
need  a  large  one,  and  a  lot  of  extra  equipment,  too,  to  read  at  real-time 
speed. It would be of no use for a blind man who wanted to read the
advertisements in a subway, or even to read a newspaper. It would require
a  fixed  font;  no  digital  computer  is  as  yet  very  portable  at  all;  and  of 
course  the  kinds  of  type  fonts  and  even  the  vocabulary,  in  advertisements 
as  well  as  in  the  rest  of  the  text,  are  not  yet  susceptible  to  pattern  analysis. 
We know how, but we have not in fact done so; and it is a long step, as
I will say again and again, from knowing the general principles of
approach to a problem like pattern recognition to knowing how to do it
technologically.

There  are  two  obvious  questions  that  I  can  raise  here:  What  can  our 
technology do now or soon? and, What do we want it to do? We shall
discuss them one at a time. There are several basic techniques in pattern
recognition by machine. Again, I repeat, we are here concerned with the
recognition of complex patterns of many elements rather than the detection
of  a  few  fixed  elements;  and  of  course  I  still  use  the  general  example  of 
reading  machines  as  a  kind  of  local  conceptual  environment. 

The  most  primitive  technique  I  call  “template  matching.”  Here,  the 
visual  signals  are  converted  wholesale  into  some  string  of  bits,  some  string 
of  choices,  which  as  a  whole  are  matched  against  stored  replicas  of  the 
26  letters.  Herr  Kazmierczak  gave  a  summary  of  assorted  ways  of  doing 
this;  most  of  his  techniques  were  essentially  template  matching.  Sample 
letters  have  of  course  to  be  correctly  positioned  and  oriented,  and  they 
have  to  be  of  the  same  type  font,  of  the  right  size,  and  not  too  dark  and 
not  too  light;  and  the  background  has  to  be  more  or  less  strictly  controlled. 
It  is  easy  to  describe  how  to  build  a  machine  like  this;  building  it  well  is 
another  matter — but  it  is  easy  to  describe  how  one  should  build  it,  and 
there  may  be  many  of  you  who  are  in  fact  associated  with  efforts  of  this 
kind.  With  typewritten  material  (that  is,  material  written  on  a  typewriter) 
there  are  in  fact  working  systems  available  right  now  that  use  template 
matching.  One  can  avoid  some  of  the  messier  problems  of  positioning  by 
clever  scanning  techniques.  The  IBM  character  reading  machine  now  reads 
only  numerals  of  the  kind  of  font  that  they  use  in  one  of  their  printing 
machines  (but  I  am  told  they  will  extend  it  to  a  full  range  of  alphanumeric 
characters).  It  uses  a  scanning  technique  in  which  it  moves  the  image  to  be 
recognized  through  all  or  nearly  all  possible  positions  and  waits  for  the  best 
check.  This  is  a  perfectly  sensible  procedure.  Very  clearly,  however,  the 
kinds  of  extrapolation  one  can  make  from  this  technique  are  very  limited. 
It  does  require  the  right  font,  it  does  require  the  right  orientation,  it  does 
require the right size, it does require fairly careful control of noise.
Unfortunately, handwritten or handprinted letters, spoken words, and even
much  printed  text  typically  show  very  wide  variations  from  any  set  of 
templates  which  can  be  prepared  in  advance. 

We can do a little better by providing a metric to measure the degree
of fit of the unknown image with every template and then picking the best
fit. We can also supply transformations which normalize the unknown
image, put it in the center of the field of view, and use some other kinds
of judgments about the orientation which we have. These measures
unfortunately do not, in practice, seem to rescue the situation. The greater
the  variability  the  less  successful  template  matching  becomes,  and,  by  our 
standards,  very  little  variability  can  reduce  the  performance  of  the  system, 
as  far  as  we  have  been  able  to  find  out,  to  unacceptable  levels. 
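
Template matching with a metric, as described above, can be sketched in a few lines. This is an illustration under stated assumptions and not a description of any machine mentioned in the talk: the 3-by-3 grids, the three-letter TEMPLATES table, the Hamming-distance metric, and the horizontal-shift search are all choices made here for brevity.

```python
# Toy 3x3 "letters" as rows of 0/1 (assumed; real scanners use far larger grids).
TEMPLATES = {
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
}

def mismatch(image, template):
    """Metric: number of cells where image and template disagree."""
    return sum(a != b for ra, rb in zip(image, template) for a, b in zip(ra, rb))

def shift(image, dx):
    """Move the image dx cells to the right, filling with background (0)."""
    width = len(image[0])
    return tuple(
        tuple(row[x - dx] if 0 <= x - dx < width else 0 for x in range(width))
        for row in image
    )

def classify(image, max_shift=1):
    """Try every template at every allowed shift; keep the best fit."""
    best = None
    for label, template in TEMPLATES.items():
        for dx in range(-max_shift, max_shift + 1):
            d = mismatch(shift(image, dx), template)
            if best is None or d < best[1]:
                best = (label, d)
    return best

# An "I" printed one cell too far to the left is recovered by the shift search.
shifted_i = ((1, 0, 0), (1, 0, 0), (1, 0, 0))
print(classify(shifted_i))  # ('I', 0)
```

The search handles a small displacement, but a change of font, size, or orientation alters many cells at once, which is the kind of variability the text says this approach does not survive.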

I have to conclude that for the general kinds of requirements as I see
them, template matching in its pure form, modified by matching and
normalization, is adequate only for printing and typewriting. I fear that an
all-font print reader cannot be handled with template matching.

I have been told with some bitterness by store clerks and bank tellers
that my signature is illegible, and so it is. But it is very easily
recognized nonetheless, even though it is never the same twice. Well, how is
it “easily recognized”? (I can assure you it is easily recognized!) The
answer is that it is recognized by having certain features or properties
(one of which of course is illegibility; there are others). All of you who
have legible signatures will not have your signatures confused with mine by
any bank teller. But there are many other kinds of properties, too. These
properties are typically not functions of any particular piece of the
signature; the interesting properties are global properties, that is,
functions of the signature as a whole. When Herr Kazmierczak spoke of the
“particular parts of the image which have to be recognized,” I envied him
for being able to talk about this, and to talk about the “particularly
important parts of the character,” because I think that many of the
interesting and necessary features for recognition are those properties
which cannot be specified or localized to a small part of the field of
view, but are in fact functions of the whole field of view.

Characterizing shapes by their distinctive features or properties can
ease the problem of variability. It is easy to specify features or properties
which are invariant over a far wider range of letters than can be covered
by a like number of templates. What kind of features am I talking about
here? There are many kinds of features. Connectivity is a global feature: Is
the letter singly connected or not? Does it contain a straight line segment?
Is it wider at the top than at the bottom? and so on; these are all global
features. There are some interesting features which are not global: a serif,
for example, which apparently increases the legibility of a printed
letter, is not global. All too few of these features have been programmed
or built into machines and we certainly do not have much operational
experience with them.
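
Two of the global features named above can be computed directly from a binary image, as the following sketch suggests. It rests on assumptions made here: “singly connected” is read as “the ink forms one piece,” neighbors are counted in four directions only, and the function names and tiny grids are invented for the example.

```python
def count_components(grid):
    """Count 4-connected groups of inked cells (1s) by iterative flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    components = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            components += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] == 1 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return components

def row_width(row):
    """Horizontal extent of the ink in one row (0 if the row is blank)."""
    inked = [x for x, v in enumerate(row) if v == 1]
    return inked[-1] - inked[0] + 1 if inked else 0

def wider_at_top(grid):
    """True if the top inked row spans more columns than the bottom inked row."""
    widths = [row_width(row) for row in grid if any(row)]
    return bool(widths) and widths[0] > widths[-1]

def property_list(grid):
    """Describe an image by global properties rather than by its cells."""
    return {
        "singly_connected": count_components(grid) == 1,
        "wider_at_top": wider_at_top(grid),
    }

LETTER_T = ((1, 1, 1), (0, 1, 0), (0, 1, 0))
SHIFTED_T = ((0, 1, 1, 1), (0, 0, 1, 0), (0, 0, 1, 0))
LOWER_I = ((0, 1, 0), (0, 0, 0), (0, 1, 0), (0, 1, 0))
for name, grid in (("T", LETTER_T), ("shifted T", SHIFTED_T), ("i", LOWER_I)):
    print(name, property_list(grid))
```

The displaced T yields the same property list as the original, which a cell-by-cell template comparison would not; that is the sense in which such features are invariant over a wider range of inputs.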

We  may  also  hope  that  a  list  of  the  properties  of  known  samples  can 
provide  a  basis  for  classifying  unknown  samples,  and  can  even  help  us 
generate  new  features  which  will  be  even  more  useful.  This  property  list — 
that  is,  describing  the  image  by  a  set  of  properties — does  not  eliminate  the 
problem  of  variability,  but  merely  transfers  it  to  another  level.  Yet,  as  far 
as  we  can  tell,  property  or  feature  extraction  is  a  step  in  the  right  direction 
and  has  proved  to  be  much  less  fallible  than  a  grand  transformation  which 
is intended to reduce all inputs to one of a finite sized set of templates. If
the property list is appropriately designed the recognition does not
necessarily lean strongly on any one feature, and may incorporate correlated
or dependent properties for correcting errors in feature extraction. It will
probably work well enough to be useful in situations where template
matching is just completely inadequate. We have, as I said, discovered no
easy way to build machines that use feature extraction; it seems to require
a substantially greater complexity than template matching.
But  there  are  features  and  features.  One  kind  of  feature  that  I  mentioned 
was  the  presence  or  absence  of  a  serif,  or  worrying  about  connectivity.  But 
there  is  another  broad  kind  of  feature. 

Suppose our machine sees the word T A _ _. Suppose the machine
can also tell that the two blanks are both the same letter. That is clearly
another kind of feature and a very useful one in this case, because it
immediately restricts the possibilities from many scores of words to just one
word, the word TALL. This is a feature that interrelates the different
parts of the word, and here it is a feature not of any single letter but a
feature of the word. This general kind of feature I include in the term
“the effect of context.” Context in pattern recognition suggests the
interrelation of decisions at different levels in the sense that the
recognition of a letter is a different level of decision from the
recognition of a word. Context in one form or another is an accompaniment of
nearly all pattern recognition schemes if they are to be effective. Often
this effect of context is implicit. The previous speaker mentioned that for
human use it is not nearly so important to get every word right because we
can rely upon the human to supply the effect of context and recognize the
word rather than the individual letters. But to use context effectively
will demand even more from the technology.
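
The effect of the “same letter” feature can be illustrated with a toy word list. The ten-word VOCABULARY below is hypothetical and far smaller than anything a real machine would need, so the result holds only within that list; the function names are likewise invented for the sketch.

```python
# Hypothetical toy vocabulary of four-letter words beginning with TA.
VOCABULARY = ["TACK", "TAIL", "TAKE", "TALE", "TALK",
              "TALL", "TAME", "TAPE", "TASK", "TAXI"]

def candidates(known_prefix, blanks, vocabulary):
    """Words that begin with known_prefix followed by `blanks` unread letters."""
    length = len(known_prefix) + blanks
    return [w for w in vocabulary if len(w) == length and w.startswith(known_prefix)]

def blanks_identical(word, prefix_len):
    """Word-level feature: every unread letter is the same letter."""
    return len(set(word[prefix_len:])) == 1

without_feature = candidates("TA", 2, VOCABULARY)
with_feature = [w for w in without_feature if blanks_identical(w, 2)]
print(len(without_feature), "candidates without the feature")
print(with_feature, "with the feature")
```

The feature says nothing about what either unread letter is, only that the two agree, yet in this list it removes every candidate but one.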

T A _ _, where the blanks represent the same letter, can be recognized
only by a technique that has an acquaintanceship with a very large
number of words of English. At a guess, Basic English (which is probably
inadequate, I would reckon) has a thousand words, each word containing
on the order of 30 bits or so. It must also be able to run rapidly through all
of the four-letter words beginning with T A in the case I just spoke of.
This kind of processing is not always speedy. If the words are stored
alphabetically, running through all of the four-letter words beginning with
T A may be easy enough; but if we have _ _ L S, we then have to run
through all the four-letter words that end in L S. This is much harder unless
we happen to have a rhyming dictionary in the machine.
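
The difference between the two lookups can be shown with a sorted word list and a second list sorted by reversed spelling, which plays the part of the rhyming dictionary. The nine-word list and the function names are assumptions made for this sketch; the storage figure simply multiplies the two estimates given in the text.

```python
import bisect

# Rough storage estimate from the text: a thousand words at about 30 bits each.
VOCABULARY_BITS = 1000 * 30

# Hypothetical toy vocabulary, stored alphabetically.
WORDS = sorted(["BALL", "COLS", "EELS", "OILS", "OWLS",
                "TACK", "TALK", "TALL", "TASK"])
# The "rhyming dictionary": the same words sorted by their reversed spelling.
REVERSED = sorted(w[::-1] for w in WORDS)

def with_prefix(sorted_words, prefix):
    """All entries starting with prefix, located by binary search."""
    lo = bisect.bisect_left(sorted_words, prefix)
    hi = bisect.bisect_left(sorted_words, prefix + "\uffff")
    return sorted_words[lo:hi]

def starting_with(prefix, length):
    """Easy case: the matches are adjacent in the alphabetical list."""
    return [w for w in with_prefix(WORDS, prefix) if len(w) == length]

def ending_with(suffix, length):
    """Hard case made easy: search the reversed list by the reversed suffix."""
    hits = with_prefix(REVERSED, suffix[::-1])
    return sorted(w[::-1] for w in hits if len(w) == length)

print(VOCABULARY_BITS)
print(starting_with("TA", 4))
print(ending_with("LS", 4))
```

Without the reversed list, the words ending in L S are scattered through the alphabetical list and every entry must be examined; the second index buys speed at the cost of storing the vocabulary twice.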

Let us suppose, then, that the demands for using context will be cheaply
or easily met.


There  are  other  kinds  of  problems  of  pattern  recognition  which  I  shan't 
go  into  in  detail;  but  I  can  pose  the  questions  of  learning,  that  is,  whether 
the  machines  themselves  should  take  a  major  responsibility  in  learning  or 
adapting  to  the  exact  nature  of  the  patterns  that  they  must  recognize.  There 
is no doubt of the power of this in general; after we have learned the habits
of speech of a friend it is much easier to understand him over a noisy
telephone. This learning can take several forms in the machine, even as it does
with  us.  Not  the  least  important  of  the  forms  of  learning  is  the  selection  of 
suitable  features  for  recognition  or  for  incorporation  into  our  pattern 
recognizers.  All  that  I  will  say  at  this  point  is  that  many  of  us  working  in 
the  field  believe  that  the  complex  goals  that  we  are  aiming  for  will  need 
the  help  of  the  machine  for  a  large  part  of  the  learning  job.  We  have  had 
all  too  little  experience  in  this. 

When  we  try  to  get  machines  to  illustrate  their  own  abilities  (and  a 
large  number  of  people  are  very  concerned  with  the  “glamour”  of  this), 
we tend to underestimate the truly amazing abilities people exhibit in
recognizing patterns. Sometimes it is in merely carrying a particular technique
to  its  extreme;  more  often  it  is  in  the  combination  of  different  kinds  of 
techniques,  different  kinds  of  abilities,  and  integrating  them  together.  I 
think  this  integration  of  many  different  techniques  is  probably  the  most 
impressive  facility  we  have.  In  vision  there  are  very  many  examples.  We 
can recognize faces (by “recognize” I don’t necessarily mean to recall the
names; I find myself unable to do that most of the time), but we can
recognize faces at a distance by clues that have not really been identified.
We do not really know anything about how we recognize a face, I’m afraid,
in the sense of being able to build a machine to do the same thing. There
is probably something to be learned from the cartoonists, who can capture
(without verbalizing about it) the recognizable features of a face and
present them to us. We don’t know how they do it, and neither do they, but
they do it.
Whether  this  is  something  that  we  can  find  out  how  to  build  into  machines 
or  not,  I  don’t  know;  we  haven’t  looked  into  it  in  any  case.  The  ability  to 
recognize  music  on  the  slightest  of  cues,  apparently,  I  think  is  absolutely 
incredible;  half  a  second  of  hearing  a  Caruso  record  and  you  know  that  it  is 
Caruso,  even  if  you  haven’t  heard  a  Caruso  record  for  many  years.  It  is  very 
hard,  looking  at  the  waveform,  to  see  what  it  is  that  you’re  recognizing,  but 
I  am  sure  that  most  of  you  have  had  this  experience.  A  couple  of  bars  of 
the  Brahms  Requiem,  and  you  know  that  it  is  the  Brahms  Requiem;  all  of 
you  have  had  many  experiences  of  this  type. 

When we talk about pattern recognition we generally like to specify
ahead of time, and we say, “Well, we ought to know the kind of vocabulary
that  we  are  recognizing.”  What  is  the  “field  of  view,”  as  it  were:  what  are 
the  choices?  To  me,  the  most  amazing  thing  that  people  do  is  to  recognize 
without  any  restriction  of  choices  (that  is,  with  the  restriction  of  choices 
as  broad  as  the  whole  of  our  experiences)  and  with  very  little  difficulty, 
that  record  of  Caruso.  To  take  another  example:  when  I  say,  “The  cow 
jumped  .  .  .”  all  of  you  can  finish  the  sentence.  It  is  very  hard  to  specify 
to  a  computer  what  kind  of  range  of  choices  this  represents,  without 
including  all  of  the  education  of  a  person.  Another  example,  at  a  higher 
grade  level  is,  “How  do  I  love  thee?”  and  you  can  all  finish  the  phrase. 
I  regard  this  as  a  very  surprising  kind  of  ability.  The  estimates  of  capacity 
of  a  computer  which  would  be  able  to  handle  this  task  would  not  I  think 
be  merely  in  the  thousands  or  hundreds  of  thousands  of  words. 

Let me summarize now what our technology can do, having just presented
some of the examples of what it can clearly not do, and what it
shows  no  great  promise  of  being  able  to  do  for  quite  some  time.  We  can 
recognize  patterns  with  machines  if  we  can  specify  them  well  enough. 
Today,  “well  enough”  means  letters,  but  not  faces;  it  means  words,  printed 
words,  but  not  words  in  handwriting;  it  means  printed  letters,  but  not 
handwritten  letters  or  script  letters.  We  have  only  slight  learning  ability 
in our machines. People are going to be an integral
part of the learning process of the machines for quite some time to come.
I  say  this  very  sadly,  because  a  large  part  of  my  own  effort  is  devoted  to 
attempting  to  get  the  machines  to  do  more  of  the  learning.  We  can  only 
do  pattern  recognition  in  fairly  restricted  and  well-specified  contexts.  The 
effects of context in general are very ill-known. It is clear that they are
important; the context of single words we can imagine building in, but we don’t
really  know  how  to  design  or  to  optimize  them.  The  advantages  of  course  are 
clear;  we  can  illustrate  the  advantages  again  and  again  in  the  ways  people 
work,  but  we  don’t  know  how  to  apply  these  ways  to  machines  as  yet. 
We  know  that  features  are  important;  we  can  guess  that  features  are  crucial. 
But  we  don’t  know  how  to  implement  them  cheaply  or  easily.  We  don’t 
know  what  they  ought  to  be. 

A  larger  point,  and  one  which  I  think  concerns  this  audience  even  more, 
is  that  the  contexts  of  use  are  not  well  known.  They  have,  in  fact,  not  been 
very  well  studied  at  all.  Technology,  as  I  see  it,  depends  upon  the  orderly 
interaction of capabilities and experience. In the context of reading
machines for the blind, most of the techniques and most of the capabilities
have  not  been  explored  at  all.  We  do  not  have  an  orderly  feedback  from 
operational experience. We depend upon experience to ask questions and
to pose requirements on the technology. The technology has also been
rather  worse  than  unimaginative  in  responding  to  the  requirements  which 
have  not  been  put  in  the  form  of  requirements  at  all;  it  is  not  enough  for 
technology  to  say,  “We  want  to  be  able  to  read.”  The  requirements  have 
to  be  better  specified  than  this.  We  must  not  demand  that  the  blind  be 
given  guidance;  that  is  not  specifying.  We  are  falling  here  into  the  same 
kind  of  difficulties  that  beset  computer  designers  some  years  ago,  upon  the 
realization  that  computers  could  in  principle  do  anything,  but  in  principle 
the  method  has  to  be  specified.  To  say  that  a  machine  can  “think”  is  a 
kind  of  truism  so  long  as  we  can  in  fact  specify  what  “thinking”  is  in  the 
right  kind  of  terms. 

One  can  make  some  further  statements  about  the  technology,  too.  The 
real  kind  of  advances  will  come  in  making  choices  about  technology,  and 
I  say  that  at  present  we  have  all  too  few  genuine  choices  to  make.  When 
I  say  that  we  should  have  operational  experience,  I  mean  that  the  only 
value  of  operational  experience  to  a  technologist  is  in  helping  him  make 
a sensible choice of courses of action. If not very many different courses
of action can be posed, the information he gets about their success will not
be very great.

In  conclusion,  I  think  there  are  many  layers  of  processing  of  visual 
signals to semantic content, and the reading device must present
information at an appropriate level of pattern recognition. It is just not feasible
to  present  information  right  now  at  the  semantic  level.  We  don’t  really 
know  anything  about  the  capabilities  of  people  in  accepting  patterns  at  the 
different  intermediate  levels.  We  are  beginning  to  find  out  a  little  bit  about 
the  effectiveness  of  spelled  speech,  but  we  don’t  really  know  about  the 
ability of people, or their operational requirements, to handle these
patterns at this level. We know the scientific principles of pattern recognition
and  the  generating  principles.  We  have  had  almost  no  experience  in  putting 
them  together  in  complex  machines,  and  it  is  surely  clear  that  in  pattern 
recognition,  the  levels  of  decisions  are  going  to  be  complex  ones  to  be 
useful  for  guidance  and  navigation  devices. 

The  demands  must  be  set  primarily  from  the  abilities  of  the  user  to 
perform the integration that will be up to him, and we need not
underestimate the kinds of integration that he can do. This information must
be  presented  in  terms  appropriate  to  the  particular  nonvisual  sense  and 
appropriate  to  the  as  yet  unexplored  operational  environment.  We  are  not, 
it  seems  to  me,  very  far  along  in  knowing  how  to  do  this.  Thus,  while  I 
am  very  optimistic  about  the  long-range  usefulness  of  technology,  I  am 
not  very  hopeful  of  substantial  successes  soon. 


PSYCHOLOGICAL  CONSIDERATIONS  IN 
THE  DESIGN  OF  AUDITORY  DISPLAYS 
FOR  READING  MACHINES 

MICHAEL  STUDDERT-KENNEDY*  and 
ALVIN  M.  LIBERMAN** 

Haskins  Laboratories,  New  York,  New  York 


Our  concern  in  this  paper  is  with  the  choice  of  an  auditory  output  for  a 
reading  machine.  We  shall  consider  the  choice  primarily  from  the  point 
of  view  of  the  blind  person  who  is  to  use  it.  This  may  mean  that  we  end 
by  throwing  the  burden  on  the  engineer,  but  that  is  what  engineers  are  for, 
after  all,  and  it  is  not  unreasonable  that  the  blind  user  who  must  listen 
to  the  machine  should  have  a  say  in  what  it  sounds  like.  So  it  is  on  the 
blind  person’s  needs  and  human  limitations  that  we  shall  dwell. 

First, his needs. They are simple. He needs a device that will permit
him to read at a rate, if not as high as, at least of the same order as that
of the sighted reader. A reasonable aim would perhaps be the rate of normal
speech, that is, about 150 to 200 words per minute. Such a rate is
necessary for two reasons. First, slow reading is irksome, especially when, as
is often the case with newspapers and magazines, the ratio of content to
text is deliberately kept low to facilitate fast reading, easy comprehension,
and mass circulation. Second, slow reading is inefficient; immediate
memory is short and the slow reader may forget the beginning of the paragraph
or  even  the  sentence  before  he  reaches  the  end.  The  fast  reader  takes  in 
the  sweep  even  if  he  occasionally  misses  the  detail;  the  slow  reader  may 
take  in  neither  sweep  nor  detail.  In  the  interest,  then,  of  both  pleasure 
and  efficiency  the  blind  reader  may  reasonably  ask  for  a  device  that  will 
read  to  him  at  least  at  a  normal  speaking  rate. 

We are not, of course, denying the utility of less ambitious devices. A
cheap, portable machine, even of limited performance, has obvious value
for the personal use of many blind persons, and we have no wish to
belittle research efforts directed toward the development and improvement
of such devices. Here, however, we have chosen to confront the larger
challenge of a high performance reading machine, the library installation,
that will give the blind reader access to the world of books from which
he is presently shut out.

* Also at Barnard College, Columbia University, New York

** Also at the University of Connecticut, Storrs, Connecticut

Turning  now  to  the  larger  matter  of  the  listener’s  limitations,  we  find 
it  illuminating  to  compare  spoken  with  written  language.  Speech,  the 
primary  medium,  is  a  continuous  flow  of  sound  arrayed  in  time;  written 
language  is  its  discontinuous  visual  representation  arrayed  in  space.  This 
discontinuity or segmentation of written language is the crux of our problem.
Alphabets were surely good solutions to the problem of segmenting
the acoustic stream for symbolic presentation in nonacoustic form. But
from our immediate point of view the solution was all too good, since
we are now faced with the reverse problem of how to put the pieces together
again.

How  does  the  sighted  reader  do  this?  First  he  scans  the  spatial  array 
of  print  and  in  so  doing  transforms  it  back  into  a  temporal  array.  But  he 
does  not  do  this  continuously;  though  the  subjective  impression  may  be 
of  his  eyes  steadily  sweeping  the  print,  in  fact  they  are  moving  in  small 
discrete  leaps,  called  saccadic  movements,  each  lasting  some  10  to  40 
milliseconds.  Between  movements  his  eyes  fixate  the  print  and  it  is  only 
during  fixation  that  retinal  stimulation  is  effective.  The  number  of  fixations 
per  line  varies  with  the  reader  and  with  the  material  being  read.  A  good 
college  freshman  studying  a  text  of  average  difficulty  will  fixate  four  or 
five  times  a  line.  Each  fixation  will  last  around  200  to  250  milliseconds  and 
total  fixation  time  will  be  90  to  95  percent  of  total  reading  time  (19). 
During  each  fixation  the  reader  “takes  in”  several  words.  Since  the  fovea, 
the  retinal  region  of  highest  acuity,  covers  a  visual  angle  of  approximately 
2  degrees,  only  one  or  two  letters  are  clearly  in  focus;  the  letters  peripheral 
to  the  fovea  are  somewhat  blurred.  The  reader  takes  another  peripheral 
look  at  the  blurred  letters  during  his  next  fixation  and  thus  achieves  a  visual 
substitute for the continuity of speech. Strictly, of course, he is still segmenting
the flow. But by grouping lines into letters and letters into syllables,
words, and phrases during fixation and by cutting the dead time of saccadic
movement between fixations to a minimum, he effectively achieves a
continuous  intake  at  a  rate  even  faster  than  that  of  spoken  language. 

We  may  remark  that  exactly  the  same  sequence  of  rapid  eye  movement 
interspersed  with  relatively  long  fixations  is  followed  when  we  look  around 
us  and  group  the  chaos  of  the  external  world  into  people,  trees,  houses, 
and objects. We do not, of course, see every detail any more than the
reader, but by long practice we have learned to focus on the essential
cues — brightness  contrast,  continuity  of  line  or  surface,  pattern  repetition 
— and  so  broadly  organize  the  visual  world  “at  a  glance,”  as  we  somewhat 
inaccurately  say.  It  is  this  spatial  span  of  vision  that  the  automatic  scanning 
guidance  device  is  unable  to  recover  from  the  free  field.  Perhaps  a  reading 
machine  can  do  a  better  job  for  the  printed  text. 

We  have  stressed  the  organizing  power  of  vision,  and  we  have  implied 
that for a high performance reading machine we must provide some auditory
counterpart of this power. The question now arises as to where the
organization  is  to  take  place,  in  the  machine  or  in  the  listener?  To  be  more 
exact,  how  much  organization  should  take  place  in  the  machine  and  how 
much  in  the  listener? 

If we place all the burden of organization on the listener we are choosing
what has been called a direct translation, nonintegrating type of machine;
the Optophone is a famous example. Such a machine presents the
listener with a series of sounds that “are generated from, and vary in accordance
with, the continuously changing contours of the print” (2).
Certainly  this  is  a  faithful  translation  of  the  text,  but  how  unlikely  it  is 
to  yield  high  reading  rates  one  can  judge  by  imagining  oneself  visually 
reading  a  text  through  a  slit  so  narrow  as  to  offer  only  a  fragment  of  a 
letter  at  a  time.  One’s  level  of  organization  would  hardly  develop  beyond 
the  letter.  And  we  surely  want  our  translation  to  catch  more  than  the 
letter  of  the  text. 

There is a more important reason why we cannot expect optimal reading
rates from such a device, and this is the fact that the ear has relatively
low  resolving  power  in  time.  A  series  of  clicks  or  other  brief  sounds  are 
heard  as  discrete  only  so  long  as  they  are  not  repeated  more  than  about 
20  times  a  second.  Above  that  rate  they  merge  and  are  heard  first  as  a 
buzz  and  then  as  a  steady  tone  of  rising  pitch.  Clicks  merge  into  a  buzz 
even  when  they  differ  from  each  other  in  frequency  composition,  intensity, 
and  duration.  In  International  Morse  Code,  for  example,  each  letter  is 
coded  into  a  distinctive  series  of  long  and  short  wave  trains  of  a  1000  cps 
tone  with  an  average  of  three  wave  trains  per  letter.  Since  the  average 
English  word  contains  5  letters  (or  15  Morse  Code  elements),  a  rate  of  1 
word  per  second  or  60  words  per  minute  is  close  to  buzz  threshold  and  the 
presumptive  upper  limit  for  Morse  Code  reception.  In  fact,  operating  rates 
are considerably lower than this: commercial radio stations send and receive
at no more than 30 to 40 words a minute, federally licensed radio
amateurs  at  13  words  a  minute  (2). 
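
The relation between word rate and element rate in this argument can be checked with a few lines of arithmetic. What follows is only a minimal sketch: the function name is hypothetical, and the constants (about 20 discrete sounds a second before fusion, 15 Morse Code elements per average word) are the round figures quoted in the paragraph above, not measured values.

```python
# Minimal sketch; the function name is hypothetical and the constants are
# the round figures quoted in the text, not measurements.
FUSION_RATE_PER_SEC = 20.0      # brief sounds merge into a buzz above about this rate
MORSE_ELEMENTS_PER_WORD = 15    # 5 letters x 3 wave trains per letter


def elements_per_second(words_per_minute, elements_per_word=MORSE_ELEMENTS_PER_WORD):
    """Average rate at which discrete code elements reach the ear."""
    return words_per_minute * elements_per_word / 60.0


if __name__ == "__main__":
    # Amateur, commercial (lower and upper), and presumptive-limit rates from the text.
    for wpm in (13, 30, 40, 60):
        rate = elements_per_second(wpm)
        share = rate / FUSION_RATE_PER_SEC
        print(f"{wpm:3d} words a minute -> {rate:5.2f} elements a second "
              f"({share:.0%} of the fusion rate)")
```

On these assumptions 60 words a minute corresponds to an average of 15 elements a second, which is why the text treats it as close to the buzz threshold, while the commercial and amateur rates lie well below it. The calculation uses average rates only and ignores the unequal lengths of the wave trains and the gaps between them.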

An upper limit of roughly the same rate may be expected from such
devices as the Optophone. In fact, the most proficient user of the Optophone,
Miss Mary Jameson, gave public demonstrations of reading at
reported rates of 60 words a minute (2). Other users achieved much lower
rates. 

The  fact  that  most  users  of  such  codes  achieve  rates  so  far  below  the 
theoretical  upper  limit  is,  incidentally,  of  some  interest.  Both  International 
Morse  Code  and  the  Optophone  use  unidimensional  codes;  that  is  to  say, 
the letter symbols differ from each other along a single dimension: duration
pattern for Morse Code, frequency for the Optophone. Pollack (16,
17) and others (5, 11, 14) have shown that listeners identify more accurately
stimuli that differ along several dimensions, and it may be that
some improvement in rate could be achieved by suitable complication
of the signal. Of course, increased accuracy of identification may not
yield increased speed of identification, but Eriksen (5) has found an increased
rate of identification for complex visual stimuli under certain conditions,
and it is possible that a similar gain might be achieved with auditory
stimuli. 

Be  that  as  it  may,  no  amount  of  stimulus  complexity  can  overreach 
the  ear.  There  is  a  physiological  upper  limit  to  the  rate  at  which  a  listener 
can  discriminate  between  successive  discrete  auditory  stimuli,  and  if  he 
cannot  discriminate  between  them  he  clearly  cannot  organize  them.  Our 
first  conclusion,  then,  is  that  we  cannot  place  all  the  burden  of  organization 
on  the  listener;  the  machine  must  do  something,  too.  The  question  is,  how 
much  should  it  do? 

Let  us  recall  the  sighted  reader.  His  first  level  of  organization  is  the 
letter.  If  we  assign  this  task  to  the  machine  we  are  choosing  what  has  been 
called  a  recognition,  letter-reading  type  of  machine  (2).  Such  a  machine 
identifies  each  letter  discretely  and  yields  a  corresponding  sound.  The 
sound  need  not  resemble  the  letter,  but  the  task  of  the  listener  is  simpler 
if  he  does  not  have  to  learn  a  new  code,  and  letters  are  the  obvious  choice. 
To  attain  reasonable  speeds  the  letters  or  their  phonemic  equivalents  must 
be  emitted  rapidly  and  in  close  succession.  Even  if  phonemic  equivalents 
are  selected  there  is  a  limit  on  how  rapidly  they  can  follow  each  other. 
One does not arrive at speech by simply compressing the intervals between
phonemes; the result of high compression will be an unintelligible
blur, not the smooth flow of speech. In other words, the letters or phonemes
must remain discrete and an upper limit on their rate of emission will
again be set by the ear’s temporal acuity. Each pronounced letter constitutes
a syllable composed on the average of two elements or phonemes.
Each spelled word five letters long will thus have approximately ten elements.
These, one may predict, will merge at high rates of spelling, say
around  100  words  a  minute  in  theory,  somewhat  less  in  practice.  This  is 
certainly  an  advance  on  the  direct  translation,  nonintegrating  type  of 
machine.  There  are  perhaps  advantages  in  this  system  from  certain  points 
of  view  which  Dr.  Metfessel  describes.*  We  still  cannot  expect,  however, 
optimal  reading  rates  from  the  method. 

Nor  is  this  weakness  solely  due  to  the  ear’s  limited  temporal  acuity; 
it  is  inherent  in  the  display  itself,  as  becomes  clear  if  we  consider  how 
the  visual  system  would  handle  such  a  display.  As  we  know,  the  visual 
system  can  process  printed  text  at  a  high  rate;  but  could  it  do  so  if  the 
text  consisted  of  single  letters  briefly  presented  in  rapid  succession — the 
visual  counterpart  of  spelled  speech?  Surely  not.  In  short,  if  we  want  an 
auditory  display  comparable  with  the  visual  display  of  printed  text  we 
must  ask  more  of  our  machine  than  that  it  should  spell  out  loud. 

To  see  how  much  more  let  us  recall  once  again  the  sighted  reader. 
If  his  first  level  of  organization  is  the  letter,  his  second  is  at  least  the 
syllable, if not the word or phrase. And if we assign this task to the machine
(as it seems we must) we are choosing a recognition, syllable- or
word-reading type of machine. That is to say, we are choosing a machine
with  a  speech  or  at  the  very  least  a  speech-like  output. 

The  nature  of  the  machine  or  machines  that  could  yield  such  an  output 
we  will  discuss  below.  Here  we  may  remark  that  spoken  language  has  one 
obvious  advantage  over  spelled  language:  the  phoneme  elements  that 
carry  the  information  are  encoded  into  higher  order  units — syllables.  This 
is  not  to  say  that  the  minimal  acoustic  unit  of  speech  is  necessarily  the 
syllable  or  that  the  phonemic  segments  interact  and  lose  their  identity  in 
some  higher  Gestalt.  On  the  contrary,  the  evidence  is  that  speech — as 
produced  and  very  possibly  as  perceived — may  be  described  as  the  sum 
of  independent,  articulatory  components  at  a  level  below  even  that  of 
the  phoneme  (13).  Nonetheless,  the  packaging  of  the  phonemic  segments 
into  syllables  in  speech,  as  compared  with  their  discrete  delivery  in  spelled 
speech,  has  the  consequence  that  whether  we  regard  the  phoneme  or  the 
syllable  as  the  essential  acoustic  element,  the  number  of  elements  per 
word  is  reduced.  If  we  take  the  phoneme  as  our  element,  we  have  instead 
of  approximately  10  elements  per  word  as  in  spelled  speech  an  average 
of around 3 or 4, a reduction by a factor close to 3. Thus we might
expect that speech would become an unintelligible buzz at a rate of
about 5 words a second or 300 words a minute. With the syllable as our
element we would predict an even higher upper limit. In practice, around
200 words a minute is probably as fast a rate as one can comfortably
listen to for any extended time. A reading machine with such an output
will therefore still yield reading rates below those of sighted readers. But
that is in the nature of the auditory stimulus and of the ear, and we may
be confident that we shall find no auditory display with a higher rate of
information transfer than speech.

* See below, following paper.
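
The ceilings compared in this argument follow from one division. The sketch below is illustrative only: the names are hypothetical, the fusion rate of about 20 discrete sounds a second and the element counts per word are the estimates given in the text, and the result is a strict arithmetic bound rather than a prediction of practical reading rates.

```python
# Minimal sketch; names are hypothetical, figures are the estimates in the text.
FUSION_RATE_PER_SEC = 20.0  # discrete sounds merge into a buzz above about this rate


def ceiling_wpm(elements_per_word, fusion_rate=FUSION_RATE_PER_SEC):
    """Word rate at which elements arrive exactly at the fusion rate."""
    return fusion_rate / elements_per_word * 60.0


# Elements per average word, as estimated in the text.
DISPLAYS = {
    "spelled speech": 10,       # 5 letters x 2 phonemes per pronounced letter
    "speech, 4 phonemes": 4,    # upper end of "around 3 or 4"
    "speech, 3 phonemes": 3,    # lower end of "around 3 or 4"
}

if __name__ == "__main__":
    for name, count in DISPLAYS.items():
        print(f"{name:20s} {ceiling_wpm(count):4.0f} words a minute")
```

With four phonemes per word the bound is the 300 words a minute given above. For spelled speech the same arithmetic gives 120 words a minute, somewhat above the round figure of about 100 offered earlier; both are order-of-magnitude estimates.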

Perhaps  we  may  seem  to  have  sidestepped  the  problem.  We  started 
by  asking  how  a  reading  machine  might  best  recover  the  acoustic  flow 
that  language  lost  when  it  went  into  print,  and  our  answer  turns  out  to  be: 
by  recovering  the  acoustic  flow  that  language  lost  when  it  went  into  print. 
Obviously  we  would  not  be  rash  enough  to  make  the  suggestion  were  a 
reading  machine  with  a  speech  or  speech-like  output  not  feasible.  But 
recent  work  suggests  that  it  is.  Furthermore,  a  speech  output  has  more 
advantages  over  nonspeech  than  a  mere  reduction  in  the  number  of  discrete 
elements  per  word  and  the  consequent  increase  in  reading  rate.  Before  we 
turn  to  the  machine  itself,  perhaps  it  would  be  worth  digressing  for  a 
moment  to  consider  these  advantages  and  some  of  the  reasons  for  them. 

First  and  foremost,  of  course,  speech  is  speech — that  is  to  say,  a 
highly  efficient  auditory  code  with  which  the  listener  is  already  familiar. 
If the output of the machine is plain English (or plain any other language),
what more can we ask? The listener is ready to use the machine
with  no  more  training  than  is  needed  to  operate  it. 

But  are  the  advantages  of  speech  simply  those  of  familiarity?  We 
said  above  that  we  might  be  confident  we  would  find  no  auditory  display 
with  a  higher  rate  of  information  transfer  than  speech.  The  sounds  of 
speech  are  indeed  efficient  vehicles  of  information,  for  they  may  not  only 
pour  from  the  speaker  at  a  rate  close  to  the  limit  of  the  human  receiver’s 
temporal  resolving  power,  but  they  may  also  be  accurately  identified  at 
that  rate.  Now  it  is  well  known  that  man’s  ability  to  discriminate  (that  is, 
to  determine  whether  two  stimuli  are  the  same  or  different)  is  very  good, 
but  that  his  ability  to  identify  in  absolute  terms  (that  is,  to  determine 
which  stimulus  it  is)  is  relatively  poor.  But,  as  we  have  already  remarked, 
much  experimental  work  in  recent  years  has  shown  that  humans  identify 
members  of  a  set  of  complex  or  multidimensional  stimuli  with  considerably 
more accuracy than they do members of a set of unidimensional stimuli.
For example, Pollack and Ficks (17) found that they were able to transmit
to their subjects as much as 6.9 bits of information per stimulus when their
stimuli were drawn from sets of auditory stimuli taking one of two values
along  each  of  eight  dimensions.  This  is  considerably  more  than  the  upper 
limit  of  2.3  bits  that  Pollack  (16)  was  able  to  transmit  when  stimuli 
were drawn from a set of auditory stimuli varying along the single dimension
of pitch.
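
The two figures can be put on a common footing by converting bits into a number of equally likely alternatives. The following is a minimal sketch under that standard information-theoretic reading; the function names are hypothetical, and only the values 6.9 bits, 2.3 bits, and "one of two values along each of eight dimensions" come from the text.

```python
import math

# Minimal sketch; function names are hypothetical.


def max_bits(values_per_dimension, dimensions):
    """Information in one stimulus if all combinations are equally likely."""
    return dimensions * math.log2(values_per_dimension)


def equivalent_alternatives(bits):
    """Number of equally likely alternatives that `bits` would distinguish without error."""
    return 2.0 ** bits


if __name__ == "__main__":
    print("eight binary dimensions:", max_bits(2, 8), "bits available")
    for label, bits in (("multidimensional (17)", 6.9), ("pitch alone (16)", 2.3)):
        print(f"{label:22s} {bits} bits ~ {equivalent_alternatives(bits):.1f} alternatives")
```

Eight two-valued dimensions define 256 possible stimuli, or 8 bits; transmitting 6.9 of them corresponds to roughly 119 perfectly identified alternatives, against about 5 for the 2.3 bits obtained with pitch alone.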

Some  of  our  capacity  for  rapid  and  accurate  identification  of  speech 
sounds  may  be  due,  then,  to  the  complexity  of  these  sounds.  But  we  may 
note that Pollack and Ficks’s subjects did not work under pressure; in fact,
they  took  as  much  time  as  they  needed  to  identify  the  stimuli  by  marking 
a  check  list.  Whether  the  rate  of  information  transmission  is  increased 
as  much  as  its  quantity  by  complication  of  the  stimulus  we  do  not  know; 
but  we  may  reasonably  suppose  that  the  multidimensional  nature  of  speech 
sounds is a necessary, if not a sufficient, condition of their being so precisely
identifiable.

There  are,  in  fact,  good  reasons  for  believing  that  there  is  more  to  the 
perception  of  speech  than  this.  Several  lines  of  evidence  suggest  that  speech 
stimuli are perceived by reference to articulatory as well as acoustic dimensions.
For example, the acoustic stimulus does not always display the
invariance that our invariant perceptual response would lead us to expect.
The acoustic cues for the perception of a given phoneme sometimes display
abrupt discontinuities that are not reflected in the response: the response
like the articulation remains unchanged. Again, there are other
situations  in  which  precisely  the  reverse  occurs:  discontinuities  appear  in 
both  the  perception  and  the  articulation,  but  not  in  the  acoustic  stimulus. 
For  example,  many  speech  sounds  are  perceived  categorically;  that  is  to 
say,  the  change  from  the  perception  of  one  phoneme  to  the  perception  of 
another  as  we  move  smoothly  along  some  acoustic  continuum  is  not 
gradual, but abrupt. This categorical perception is paralleled by categorical
articulation.

The  explanation  for  these  discrepancies  may  lie  in  a  theory  of  speech 
perception  that  we  have  developed  more  fully  elsewhere  (12),  namely, 
that  the  perception  of  speech  is  linked  to  the  feedback  from  the  speaker’s 
own  articulatory  movement.  According  to  this  theory,  the  listener  learns 
a  connection  between  speech  sounds  and  their  appropriate  articulations. 
In  time,  the  articulatory  movements  (or,  more  likely,  the  corresponding 
neurological  processes)  come  to  mediate  between  the  incoming  acoustic 
stimulus  and  its  ultimate  perception.  If  this  is  so,  we  should  expect  that 
at  points  where  articulation  and  sound  divide,  perception  should  follow 
articulation. 

Thus there is a body of evidence (treated in detail elsewhere) suggesting
that speech sounds are perceived by reference to the articulatory
movements that produce them and that this articulatory reference is important
for their rapid, absolute identification.

Returning now to our theme, we may draw a conclusion. If our account
of the perception of speech by reference to articulation is valid the
advantages of accurate and rapid identification of sound elements will accrue
to any output that may be articulated, but not to “unspeakable”
stimuli  that  merely  resemble  speech  acoustically.  Thus  even  if  the  output 
of  a  reading  machine  is  far  from  standard  English — or  any  other  known 
language — but  is  rather  some  strange  yet  pronounceable  machine  dialect, 
the  listener  will  be  able  to  follow  it  at  a  rapid  rate. 

Of  course  he  will  have  to  learn  the  dialect.  But  here  again  speech  has 
an  advantage  over  nonspeech:  it  is  more  easily  learned.  At  the  Haskins 
Laboratories  some  years  ago  experiments  were  conducted  on  the  ease  of 
learning real and simulated nonspeech outputs from various types of reading
machines. “A synthetic pronounceable language (known as Wuhzi)
based  on  a  transliteration  of  written  English  which  preserved  the  phonetic 
patterns  of  the  words”  was  also  used  for  purposes  of  comparison  (2). 
Wuhzi  was  learned  far  more  rapidly  and  yielded  a  markedly  higher  terminal 
performance than any of the nonspeech outputs. More recently, House et al.
(8) have demonstrated a similar learning advantage for speech over nonspeech
stimuli that resembled speech acoustically but were not pronounceable.
In commenting on their results the authors say, “. . . an understanding
of the process of speech perception cannot be achieved through experiments
that study classical psychophysical responses to complex acoustic
stimuli.  Although  speech  stimuli  are  accepted  by  the  peripheral  auditory 
mechanism,  their  interpretation  as  linguistic  events  transfers  their  processing 
to  some  nonperipheral  center  where  the  detailed  characteristics  of  the 
peripheral  analysis  are  irrelevant.” 

To  sum  up,  a  speech  or  speech-like  output  from  a  reading  machine 
has  several  advantages.  First,  compared  with  both  nonspeech  and  spelled 
speech,  speech  reduces  the  number  of  discrete  elements  per  word  and 
permits  the  transmission  of  information  at  a  rate  that  is  both  rapid  and 
well within the resolving power of the ear. Second, compared with nonspeech,
speech sounds are highly distinctive and may be identified with
far  greater  accuracy  and  speed.  Third,  any  coded  output  in  a  speech-like 
— that  is,  pronounceable — form  is  readily  learned,  and,  of  course,  if  the 
language  is  plain  English  the  learning  has  already  been  done  and  the 
listener  is  at  home  from  the  start. 



We  have  said  that  a  reading  machine  with  a  spoken  output  may  be 
feasible.  Now  we  would  like  to  support  our  statement  by  describing  briefly 
two classes of machine that are being developed at the Haskins Laboratories.
The first generates speech by rule from the individual letters of
the text; the second compiles speech by sorting recordings of the individual
words of the text into appropriate sequences. Both machines require character
recognition units. The first will also require a logic unit containing
rules for selecting the phonemic equivalents of letters appropriate to their
context; the second will require a large random access memory or dictionary
for storage of the prerecorded words. While these hardware requirements
may well be met in the not too distant future, they are the
province  of  the  engineer.  Here  we  are  concerned  solely  with  the  spoken 
output  of  the  machines. 
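
The second class of machine, speech by compilation, amounts to a dictionary lookup followed by sequencing. The sketch below is purely illustrative and is not a description of the Haskins equipment: it assumes that character recognition has already produced the text as a string, and the function name, the `.rec` labels standing for stored recordings, and the handling of words missing from the dictionary are all hypothetical.

```python
import string

# Minimal sketch of word-level compilation; all names are hypothetical.


def compile_utterance(text, recordings):
    """Return (sequence, missing).

    sequence: stored recordings for the words of `text`, in text order.
    missing:  words of `text` that have no entry in `recordings`.
    """
    sequence, missing = [], []
    for token in text.split():
        word = token.strip(string.punctuation).lower()
        if not word:
            continue  # token was punctuation only
        if word in recordings:
            sequence.append(recordings[word])
        else:
            missing.append(word)
    return sequence, missing


if __name__ == "__main__":
    # A toy dictionary; a real one would hold many thousands of words.
    store = {w: f"{w}.rec" for w in ("never", "kill", "a", "snake")}
    print(compile_utterance("Never kill a snake.", store))
    print(compile_utterance("Never kill a cobra.", store))
```

The second call shows the practical cost of the method: any word absent from the store cannot be spoken, which is why the text calls for a large random access memory or dictionary.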

First,  let  us  consider  the  synthesis  of  speech  by  rules  from  a  phonemic 
input.  The  basis  for  such  a  set  of  rules  has  been  laid  by  a  long  series  of 
researches  in  our  Laboratories  into  the  acoustic  cues  for  the  perception 
of  speech.  In  this  research  spectrographic  analyses  of  speech  have  been 
studied minutely and reduced to a skeletal form in which only their essential
features are preserved. Figure 1 illustrates the procedure. The top
line  shows  a  print  of  the  original  spectrogram  of  the  words,  “Never  kill  a 
snake.”  The  middle  line  shows  a  reduced  version  painted  by  hand.  The 
bottom  line  shows  the  final  simplified,  painted  version  arrived  at  by  trial 
and  error  in  which  the  essential  acoustic  cues  are  preserved  and  even 
brought  into  relief.  This  pattern  is  effectively  a  set  of  graphic  instructions 
for  the  frequency  and  amplitude  display  and  its  changes  over  time  necessary 
to  yield  an  intelligible  version  of  the  original  sentence.  If  the  pattern  is 
reconverted  into  sound  (by  means  of  a  photoelectric  device  known  as  the 
Pattern  Playback  [4])  such  a  version  will  be  heard. 

By  extensive  research  over  the  last  ten  years,  techniques  have  been 
developed  for  the  hand  painting  of  simplified  spectrograms — or,  in  other 
words, for the synthesis of speech — without reference to original spectrograms,
and these techniques have been explicitly formulated as a set
of  rules  for  the  synthesis  of  speech  by  Liberman  et  al.  (13).  These  rules 
are  designed  to  be  “few  in  number,  simple  in  structure,  and  susceptible  of 
mechanization”  to  convert  a  string  of  phonemes  into  reasonably  intelligible 
speech at normal speaking rates. While this is not the place for an exhaustive
discussion of the rules, it may be of interest for us briefly to
examine  their  structure. 

Figure 1  Spectrogram Patterns. Top: spectrogram of the words, “Never
kill a snake.” Middle: a reduced version of the spectrogram painted by hand.
Bottom: a further simplification, with the principal features accentuated;
painted for use on the Pattern Playback. For further explanation see text.

Earlier we remarked that the phonemic elements of speech, even though
blended by the articulatory process into higher order syllabic units, do not
lose  their  identity:  they  are  independent  or  additive.  This  is  not  to  say  that 
the  context  in  which  a  phoneme  occurs  is  irrelevant.  In  applying  the  rules 
for  the  production  of  a  given  phoneme  one  must  know  the  appropriate 
formant  levels  for  adjacent  phonemes  so  that  the  units  may  be  sequentially 
combined into a smoothly flowing pattern. For some few phonemic combinations
it is even necessary to write qualifications or “position modifiers”
of  the  basic  rules,  but  for  the  most  part  the  rules  may  be  so  written  that 
satisfactory  sequential  accommodations  are  achieved  without  modifiers. 
In  principle  the  number  of  rules  need  scarcely  be  greater  than  the  number 
of  phonemes. 

Furthermore,  since  the  acoustic  cues  for  the  perception  of  phonemes 
fit  readily  into  the  linguist’s  articulatory  classification,  a  further  reduction 
in  the  ratio  of  rules  to  phonemes  is  achieved  by  writing  the  rules  in  terms 
of  the  subphonemic  or  articulatory  dimensions  of  place,  manner,  and 
voicing. Statements must then, of course, be included within the subphonemic
rules to permit their simultaneous combination. The rule for a
stated  class  of  phonemes  thus  consists  of  all  the  statements  necessary  to 
specify  the  acoustic  cues  on  each  of  the  subphonemic  dimensions  and  to 
permit  their  simultaneous  and  sequential  combination. 

An  example  of  synthesis  by  minimal  rules  is  shown  in  Figure  2.  The 
general  comments  about  the  relations  between  rules  apply  in  the  case  of 
the  phoneme  /b/,  for  example,  as  follows.  The  set  of  rules  for  /b/  contains 
separate  rules  for  the  stop  consonants  /pbtdkg/,  the  labials  /pbfvm/,  and 
the voiced consonants /bdg/; certain statements in these rules must be combined
simultaneously to specify, for example, the “silence”; and certain
other statements must be combined sequentially with the rules for /æ/ and
/z/, in particular those that generate formant transitions.
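
The way a rule for a single phoneme is assembled from class rules on the three subphonemic dimensions can be sketched as a table lookup. This is only an illustration and not the formulation of Liberman et al. (13): the class memberships are those listed in the text, the statements are abbreviated from Figure 2, the data layout and function name are hypothetical, and the fricative and resonant classes are left out to keep the symbols to plain letters.

```python
# Minimal sketch; layout and names are hypothetical. Class memberships are
# those given in the text; statements are abbreviated from Figure 2.
CLASSES = {
    "manner": {"stop": "pbtdkg"},
    "place": {"labial": "pbfvm", "alveolar": "tdsz"},
    "voicing": {"voiced stop": "bdg"},
}

STATEMENTS = {
    "stop": ['No sound at formant frequencies ("silence")',
             'Burst of specified frequency and band width follows "silence"'],
    "labial": ["F2 and F3 loci are specified"],
    "alveolar": ["F2 and F3 loci are specified"],
    "voiced stop": ["Voice bar",
                    'Duration of "silence" is specified',
                    "F1 onset is not delayed"],
}


def rule_for(phoneme):
    """Collect, per dimension, the class rule that covers `phoneme`."""
    rule = {}
    for dimension, classes in CLASSES.items():
        for class_name, members in classes.items():
            if phoneme in members:
                rule[dimension] = (class_name, STATEMENTS[class_name])
    return rule


if __name__ == "__main__":
    for dimension, (class_name, statements) in rule_for("b").items():
        print(f"{dimension:8s} {class_name}: " + "; ".join(statements))
```

For /b/ the lookup returns one class rule on each dimension (stop, labial, voiced), while for /p/ it returns no voicing entry. The economy of the scheme is that each class rule is written once and shared by every phoneme in the class.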

We  have  so  far  made  no  mention  of  prosodic  features.  In  particular  we 
have  not  mentioned  stress,  for  which  some  provision  in  the  rules  is  clearly 
essential.  In  natural  speech  stress  is  signaled  by  variations  in  one  or  more 
of  the  acoustic  dimensions  of  fundamental  frequency,  intensity,  and  duration. 
In  the  minimal  rules  of  Liberman  et  al.  (13),  only  duration  is  used  and  only 
one  stress  modifier  is  included  in  the  set  of  rules.  An  example  of  this  is 
illustrated in Figure 2 in the position cell for /æ/.

SYNTHESIS BY RULE: /læbz/

Manner
  Resonants /wrly/: Periodic sound (buzz); formant intensities and
    durations are specified. F1 locus is high. Formants have explicit loci.
  Long Vowels /ieeiaoo/: Periodic sound (buzz); formant intensities and
    durations are specified.
  Stops /pbtdkg/: No sound at formant frequencies; i.e., “silence.” Burst of
    specified frequency and band width follows “silence.” F1 locus is low.
    F2 and F3 have virtual loci.
  Fricatives /fvθðszʃʒ/: Aperiodic sound (hiss), intensity and band width
    are specified. F1 locus is intermediate. F2 and F3 have virtual loci.

Place
  /l/: F2 and F3 loci are specified.
  /æ/: Formant frequencies specified.
  Labials /pbfvm/: F2 and F3 loci are specified. Frequencies of buzz and
    hiss are specified.
  Alveolars /tdsz/: F2 and F3 loci are specified. Frequencies of buzz and
    hiss are specified.

Voicing
  (The voicing rules are only applied to those phonemes for which the
  condition of voicing has differential value. For the resonants and vowels,
  which are invariably voiced, the acoustic features correlated with voicing
  are specified under Manner.)
  Voiced /bdg/: Voice bar. Duration of “silence” is specified. F1 onset is
    not delayed.
  Voiced /vðzʒ/: Voice bar. Duration of hiss is specified. F1 onset is not
    delayed.

Position
  Vowels in final syllable: Duration is double that specified under Manner.

[Reduced spectrogram. Vertical axis: FREQUENCY IN CPS, 0 to 4800 in steps of 600.]

Figure 2  Word Synthesis. Top: rules for the synthesis of the word, “Labs”
(/læbz/). Bottom: a reduced spectrogram, painted by hand according
to the rules, for use on the Pattern Playback. For further explanation see
text.

To round out this brief discussion, Figure 3 gives an example of a
longer utterance painted by rule. This was one of the earliest attempts and
was painted for use on the Pattern Playback. Figure 4 shows a more elaborate
example, also painted without reference to a spectrogram, this time
for use with the Voback synthesizer (1). Note that this includes a pitch
line, providing an added cue for intonation. A tape recording of these
utterances has been made.

Obviously,  such  a  set  of  rules  achieves  its  parsimony  and  simplicity  by 
reducing the acoustic specifications to those elements essential for recognition
of the phoneme. Consequently the resulting speech is not entirely
natural. It is readily intelligible and the rules do provide an explicit procedure
for converting a phonemic transcription into control signals for a
speech  synthesizer.  In  practice,  they  have  been  used  satisfactorily  not  only 
at  Haskins  Laboratories,  but  also  at  the  University  of  Edinburgh  (9).  At 
the  Bell  Telephone  Laboratories  rules  based  on  those  described  above  have 
been  used  to  program  a  digital  computer,  which  then  generated  control 
signals  and  drove  a  simulated  resonant  speech  synthesizer  (10). 


[Painted spectrogram of the utterance, “I painted this by rule without
looking at a spectrogram. Can you understand it?”]

Figure 3  An Early Example of a Reduced Spectrogram Painted by Rule
for Use with the Pattern Playback

[Painted spectrogram, with a pitch line marked 140 cps, of the utterance,
“Alexander’s an intelligent conversationalist.”]

Figure 4  A More Elaborate Example of a Reduced Spectrogram
Painted by Rule for Use with the Voback Synthesizer (1). Note the pitch
instructions which provide an added cue for intonation.



Finally, let us consider briefly the second method of generating speech
under development at Haskins: speech by compilation. Here the first
problem is the size of the speech segment to be used: phoneme, syllable,
word, or phrase. The difficulties of using short segments have been
discussed by several authors (7, 13, 15, 18). We have already remarked that
the phonemic elements of speech, though linguistically discrete, are merged
acoustically into higher order units. Just as we cannot specify clearly the
temporal boundaries of the phoneme on the spectrogram, for example, so
too we cannot synthesize the smooth flow of speech from discrete
prerecorded phonemic elements. Peterson et al. (15) have attempted to by-pass
this problem by using phonemic pairs or dyads that contain “parts of two
phones with their mutual influence in the middle of the segment.” However,
the difficulties of matching the cut ends are not eliminated. Furthermore,
Sivertsen (18) has shown that “the segment inventory becomes
disproportionately large when the segments are not co-terminous with
linguistic units.” Similar problems are encountered with the syllable as a
prerecorded element. It would seem in fact that words are the smallest
elements we can hope to combine into reasonably natural cursive speech.

Yet the use of words is not without difficulties. Not least is the
instrumental problem of how to store and retrieve the necessarily very large
number of recordings, perhaps as many as 10 to 20 thousand in a
satisfactory reading machine for the blind. By present-day techniques such a
number would require a pair of very large disc or drum memory units.
However, a smaller device with a vocabulary of some 7000 words is
already being developed at Haskins Laboratories (3). This pilot Word
Reading Machine will be used to evaluate the word compilation method
for use in a reading machine for the blind.

The most important problem in developing the device has been how to
record a single version of each word that will be acceptable for a variety
of syntactical uses. The solution has been to assign each word to a
grammatical class according to its most frequent usage, and to assign to each
grammatical class an appropriate pattern of duration, intensity, and pitch
change. A practised speaker is then instructed in terms that he can apply
to the monitoring of his own output and, after some trial and error, a
satisfactory recording is made (6). When the separate recordings are combined
they yield an output that is highly intelligible at slow-to-average reading
rates and that approximates the phrasal patterns of normal speech. Of
course, the quality would be improved if the machine were provided with
several versions of each word and with linguistic rules and logical circuitry
to select the version most appropriate to the syntactical context. With the
increase in quality would come, however, a perhaps disproportionate
increase in complexity and cost.
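
The word-compilation scheme can be illustrated with a short sketch. It is
only a sketch under stated assumptions: the class names, the prosody labels,
and the three-word vocabulary are hypothetical, and plain strings stand in
for the stored recordings. It shows one version per word, chosen by the
word's most frequent grammatical class, with the prosodic pattern attached
to the class rather than to the word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prosody:
    # Hypothetical labels; the text gives no numerical values.
    duration: str
    intensity: str
    pitch_change: str


# One prosodic pattern per grammatical class (illustrative values only).
CLASS_PROSODY = {
    "article": Prosody("short", "low", "level"),
    "noun": Prosody("long", "high", "falling"),
    "verb": Prosody("medium", "medium", "level"),
}

# Each word is stored once, under its most frequent grammatical class.
VOCABULARY = {
    "the": "article",
    "machine": "noun",
    "reads": "verb",
}


def compile_speech(text):
    """Return (word, prosody) pairs in reading order.

    Words missing from the vocabulary are paired with None.
    """
    output = []
    for word in text.lower().split():
        word_class = VOCABULARY.get(word)
        prosody = CLASS_PROSODY[word_class] if word_class else None
        output.append((word, prosody))
    return output


for word, prosody in compile_speech("The machine reads"):
    print(word, prosody)
```

Providing several versions of each word, as the text suggests, would amount
to making the lookup depend on the neighboring words as well, which is where
the added complexity and cost would arise.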

In sum, we have argued that while there is obvious value in cheap and
portable reading aids for the personal use of the blind, full access to our
libraries can only be given by a high performance reading machine that will
enable the blind to read at the comfortable rate of normal speech. The
auditory output of the machine should be speech or at the very least
speechlike. Such an output requires little or no training of the blind user.
We may be confident for several reasons that we shall find no other auditory
display capable of transmitting information either as rapidly or as
accurately. Two methods of providing a speech display are currently being
developed at the Haskins Laboratories: one method synthesizes speech from
a phonemic input; the other compiles speech from a dictionary of
prerecorded words. The program of development is still at too early a stage
for a final choice between them to be made. The hardware required to link
these outputs with the printed page is not yet available, but may well be so
in the not too distant future. Granted this, we may reasonably hope that
high performance reading machines with speech outputs will one day be
installed in our public libraries and educational institutions.


REFERENCES 

1. Borst, J. M., and F. S. Cooper, “Speech Research Devices Based on a Channel
   Vocoder,” J. Acoust. Soc. Amer., Vol. 29 (1957), p. 777.

2. Cooper, F. S., “Research on Reading Machines for the Blind,” in P. A. Zahl (ed.)
   Blindness. Princeton: Princeton University Press, 1950.

3. Cooper, F. S., “Toward a High Performance Reading Machine for the Blind,”
   in Human Factors in Modern Technology. New York: McGraw-Hill (in
   press).

4. Cooper, F. S., A. M. Liberman, and J. M. Borst, “The Interconversion of Audible
   and Visible Patterns as a Basis for Research in the Perception of Speech,” Proc.
   Nat. Acad. Sci., Vol. 37 (1951), pp. 318-328.

5. Eriksen, C. W. Multidimensional Stimulus Differences and Accuracy of
   Discrimination. Wright Air Development Center Tech. Rep. 54-165, June 1954.

6. Gaitenby, J., “Word-Reading Device: Experiments on the Transposability of
   Spoken Word,” J. Acoust. Soc. Amer., Vol. 33 (1961), p. 1664.

7. Harris, K., “Study of the Building Blocks of Speech,” J. Acoust. Soc. Amer., Vol.
   25 (1953), pp. 962-969.

8. House, A. S., K. N. Stevens, T. T. Sandel, and J. B. Arnold, “On the Learning of
   Speechlike Vocabularies,” J. Verb. Learn. Verb. Behav., Vol. 1 (1962), pp.
   133-143.

9. Ingemann, F., “Eight-Parameter Speech Synthesis,” in Progress Report.
   Edinburgh: University of Edinburgh, 1960 (Phonetics Department).

10. Kelly, J. L., and L. J. Gerstman, “An Artificial Talker Driven from a Phonetic
    Input,” J. Acoust. Soc. Amer., Vol. 33 (1961), p. 835.

11. Klemmer, E. T., and F. C. Frick, “Assimilation of Information from Dot and
    Matrix Patterns,” J. Exp. Psychol., Vol. 45 (1953), pp. 15-19.

12. Liberman, A. M., F. S. Cooper, K. S. Harris, and P. F. MacNeilage. “Motor
    Theory of Speech Perception.” Preprint for Speech Communication Seminar,
    Stockholm, 1962.

13. Liberman, A. M., F. Ingemann, L. Lisker, P. Delattre, and F. S. Cooper,
    “Minimal Rules for Synthesizing Speech,” J. Acoust. Soc. Amer., Vol. 31 (1959),
    pp. 1490-1499.

14. Miller, G. A., “The Magical Number Seven, Plus-or-Minus Two, or, Some Limits
    on Our Capacity for Processing Information,” Psychol. Rev., Vol. 63 (1956),
    pp. 81-96.

15. Peterson, G., W. S-Y. Wang, and E. Sivertsen, “Segmentation Techniques in
    Speech Synthesis,” J. Acoust. Soc. Amer., Vol. 30 (1958), pp. 739-742.

16. Pollack, I., “The Information of Elementary Auditory Displays,” J. Acoust. Soc.
    Amer., Vol. 24 (1952), pp. 745-749.

17. Pollack, I., and L. Ficks, “Information of Elementary Multidimensional Auditory
    Displays,” J. Acoust. Soc. Amer., Vol. 26 (1954), pp. 155-158.

18. Sivertsen, E., “Segment Inventories for Speech Synthesis,” Lang. Speech, Vol. 4
    (1961), pp. 27-61.

19. Woodworth, R. S., and H. Schlosberg. Experimental Psychology. New York:
    Holt, Rinehart and Winston, 1954.


EXPERIMENTAL  STUDIES  OF  HUMAN 
FACTORS  IN  PERCEPTION  AND  LEARNING 
OF  SPELLED  SPEECH* 

MILTON  F.  METFESSEL 

University  of  Southern  California,  Los  Angeles,  California 


Spelled  speech  consists  of  very  rapid  spelling  in  which  the  letter  sounds 
are  run  closely  together.  For  each  word  this  provides  a  new  pronunciation 
which  resembles  a  foreign  language  word  rather  than  a  sequence  of  letters. 
This  kind  of  spelling  is  in  contrast  to  that  usually  heard,  which  is  well 
illustrated  by  the  procedure  of  a  spelling  bee.  With  the  latter,  which  may  be 
called  discrete  spelling,  each  letter  is  heard  as  a  distinct  unit  separate  from 
those adjoining it. Discrete spelling is a cumbersome method of
communication.

Distinction  may  also  be  made  between  natural  and  synthetic  spelling. 
Any  instance  in  which  a  person  spells  out  material  in  either  discrete  or 
spelled  speech  form  would  be  classified  as  natural  spelling.  Each  letter 
is  produced  anew  whenever  it  occurs  in  the  text.  The  speaker  may  vary 
pitch,  intensity,  timbre,  and/or  duration  of  any  letter  from  one  time  to 
another. Synthetic spelling refers to taking a single set of 26 letter
pronunciations (voice fragments) and putting them together in various
combinations to form words.

Synthetic  production  of  discrete  spelling  is  not  difficult,  since  the  various 
letters  do  not  have  to  flow  together  or  coalesce.  To  synthesize  satisfactory 
spelled  speech,  however,  it  is  necessary  to  have  a  set  of  alphabet  sounds 
each of which goes smoothly with every other letter with which it is
combined in the language. It is not to be expected, even with the best selection
of  alphabet  sounds,  that  synthetic  spelled  speech  will  sound  exactly  like 
natural  spelled  speech.  Compromises  are  necessary;  and  the  goal  becomes 
one  of  finding  letter  sounds  which  can  be  used  anywhere  in  a  word  without 
standing  out  as  unsuitable  in  any  position. 

*  This  research  was  carried  out  under  Contract  V1005P-5321  with  the  Veterans 
Administration,  Prosthetic  and  Sensory  Aids  Service. 



From 1955 to 1961, a major portion of our research was devoted to the
development of one or more sets of alphabet sounds which, if used in an
automatic reader, would provide satisfactory synthetic spelled speech. More
than 20 alphabets were constructed, each one closer to the goal. The
procedure was to have individuals practice rapid spelling, record their
productions on tape, and select letter sounds on the basis of four criteria:

(1) easy identifiability;

(2) rapid rate (90 or more words per minute);

(3) acceptable sound quality; and

(4) coalescence (that is, smooth transitions from letter to letter so that
words and phrases rather than separate letters become the units of
perception).

Early in the research, alphabets were selected that met the first three
criteria. It was the fourth, that of coalescence, which proved difficult to meet.

In working with various alphabets much detailed information was
obtained concerning characteristics of letters related to coalescence. The
aesthetic principle of “variety within unity” came to be a guiding one in the
selection of alphabet sounds. It was also discovered that letters with certain
phonetic characteristics behaved in similar fashion. If one of these coalesced
well with other letters put before or after it, the probability was great that
other members of the same phonetic group would also show smooth
transitions in the same combinations. These findings led to development of a
system for selecting alphabet letters that involves making use of phonetic
equivalences to (1) reduce the number of letter samples that need to be
considered and (2) predict what will happen when various letters are placed in
combination. In applying it, only those letter samples are considered which
are short enough to produce spelled speech at the desired rate and which
can be discriminated from other letters.

With  the  use  of  this  system  two  alphabets  regarded  as  satisfactory 
have  been  developed,  one  with  female  voice  and  the  other  with  male  voice. 
In  both  of  them  the  letter  lengths  are  such  that  spelled  speech  at  the  rate  of 
80  to  90  words  a  minute  is  possible. 
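
As a rough check on what such a rate implies for letter length, the
arithmetic can be sketched as follows. The average of five letters per word
is an assumption introduced here for illustration, not a figure from the
study, and pauses between words are ignored.

```python
def mean_letter_duration(words_per_minute, letters_per_word=5.0):
    """Average time available per letter, in seconds.

    letters_per_word is an assumed average (not a figure from the text),
    and pauses between words are ignored.
    """
    if words_per_minute <= 0 or letters_per_word <= 0:
        raise ValueError("rate and word length must be positive")
    return 60.0 / (words_per_minute * letters_per_word)


for rate in (80, 90):
    seconds = mean_letter_duration(rate)
    print(rate, "words/min ->", round(seconds, 3), "s per letter")
```

Under these assumptions each letter sound would have to average roughly
0.13 to 0.15 second; a different assumed word length scales the result in
proportion.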

Since the success or failure of a reading machine with spelled speech
as output will hinge on whether blind people will take the time to learn
spelled speech, the development of a training program that will make the
task interesting, challenging, and rewarding is of prime importance. Since
late February, 1962, we have been engaged in investigating the best way
to program such training and in trying out a number of procedures for
presenting the material. All the material used in the sessions has been
synthetic spelled speech (made by splicing together magnetic tape containing
the 26 individual letters of our alphabets or combinations made from them).
During the course of the training the subjects have been asked to identify
single words, groups of words, and complete sentences. They have also
been asked to give a meaningful response to material presented in spelled
speech by speaking the opposite of the word spelled, by answering
questions spelled to them, by completing spelled sentences, and by telling the
information conveyed in paragraphs consisting of a number of sentences.

The  trainees  in  these  sessions  have  been  25  college  students.  Sighted 
rather  than  blind  persons  have  been  used  in  these  pilot  studies,  preparatory 
to  training  blind  students  during  the  coming  year. 

Many  of  the  procedures  incorporated  into  the  training  were  based  on 
suggestions  from  earlier  studies  made  with  natural  spelled  speech.  These 
included the following: (1) individual rather than group training; (2) tape
recording of each session in its entirety; (3) presentation of
material in such fashion as to give immediate knowledge of results; (4)
restriction of alternatives for the subject in the early stages of training by the
kinds of instructions used and the sequences of items presented; and (5)
participant  control  of  the  presentation  of  the  material.  The  last  of  these  was 
present  to  a  limited  extent  in  these  training  sessions  and  is  to  be  extended 
in  the  future  with  use  of  a  teaching  machine  currently  under  construction 
and  suitable  for  blind  as  well  as  sighted  subjects. 

Fundamental  in  connection  with  spelled  speech  is  the  idea  that  the 
rapidly  spelled  words,  with  the  letters  flowing  together,  can  be  perceived 
as  configurations.  In  presenting  the  synthetic  spelling  to  the  trainees  each 
word was given in its spelled speech form from the start, rather than
beginning with relatively slow, discrete spelling and working up gradually to
the spelled speech form. That it is unnecessary to start with discrete spelling
is supported by the finding that subjects during their first session were able
to identify words correctly in spelled speech form. From the standpoint of
the goal of learning this was considered important, since starting with
discrete spelling would emphasize perception on an individual letter basis and
would  delay  initial  experience  with  the  new  pronunciation  eventually  to  be 
learned. 

The training sessions with synthetic spelled speech have, like the earlier
ones with natural spelled speech, provided evidence that with relatively
short periods of training individuals are able to respond to meaning of
material given in spelled speech. They have also yielded data on marked
improvement in accuracy and speed of response with practice. In
addition, they have given information relating to a number of important points
in connection with the programming of training:

1. Some words are much more easily grasped when heard the first time
than  others.  This  is  not  merely  a  matter  of  length,  for  in  many  instances 
long  words  are  perceived  more  easily  than  short  ones.  A  start  has  been  made 
on  the  identification  of  the  characteristics  of  words  that  make  them  difficult 
to  perceive  so  that  vocabulary-building  sessions  will  be  able  to  concentrate 
on  troublesome  items. 

2.  To  some  extent  provisions  will  need  to  be  made  for  individualized 
vocabulary  building,  since  individuals  encounter  different  letter  problems. 
Most  effective  training  of  blind  people  will  involve  diagnosing  in  early 
sessions  which,  if  any,  letter  distinctions  are  troublesome  for  the  person  and 
giving  special  training  with  respect  to  them. 

3.  Context  helps  markedly  in  decreasing  letter  problems,  although 
it  does  not  eliminate  them  completely. 

4. Not only do marked individual differences exist in initial response
to spelled speech, but these initial differences provide considerable
predictability with respect to later performance.

5. The training sessions have raised interesting questions with respect
to the evaluation of the rate of spelled speech which individuals are able to
handle. A distinction needs to be made between comprehension time and
communication time. In testing the efficiency of a training program,
communication time necessarily is stressed more than would be the case with
the blind users of an automatic reader, for whom this preparatory
experimental work is being done.


TACTUAL-KINESTHETIC PERCEPTION
OF INFORMATION

JAMES  C.  BLISS 

Stanford  Research  Institute,  Menlo  Park,  California 


INTRODUCTION 

There  are  several  indications  that  the  tactile  and  kinesthetic  senses  are 
capable  of  information  rates  approaching  those  of  our  visual  and  auditory 
senses.  One  of  these  is  the  ability  of  a  deaf-blind  person  to  perceive  speech 
by  placing  his  fingers  on  the  lips,  jaw,  and  throat  of  the  speaker.  In  this 
way  speech  can  be  understood  at  its  normal  rate.  Another  indication  of  the 
capabilities  of  the  tactile  and  kinesthetic  senses  as  information  channels 
is  the  ability  of  the  deaf-blind  to  communicate  among  themselves  by  means 
of a finger language. While the visual sense is capable of a continuous
information rate of about 40 bits/sec, it appears that the tactile and
kinesthetic senses may be able to achieve an information rate of at least 20
bits/sec.

In both of these examples of communication achievements of the
deaf-blind, the forms of the information-bearing stimuli are limited and there
is no reason to assume they are optimum. Speech evolved to fit the capabilities
of the vocal organs and the ear, and sign language evolved to match the
capabilities of the fingers and the eye. There is no reason to assume that
either speech or sign language is the ideal language for the tactile and
kinesthetic senses.

The kinesthetic and tactile senses offer many possibilities for
information transfer. These senses respond to a wide range of stimuli, e.g.,
mechanical, thermal, electrical, and chemical. Type of stimulation, location,
intensity, and time can each be used as information bearing elements.
Moreover, information can be acquired actively or passively by the observer.
For example, when information is handwritten on the skin of a person, he is a
passive receiver, whereas in reading braille he receives information by active
movements of his fingers across embossed dots.

In spite of this apparent potentiality, I know of no machine from which
a subject can acquire information via the tactile and kinesthetic senses at
a rate approaching speech. Numerous investigators have experimented with
tactile communication systems. However, even though the kinesthetic sense
is a major information channel for the blind in both reading and mobility,
few attempts have been made to couple this sense directly to a mechanical
information display. The first communication system described below was
designed to stimulate the kinesthetic sense primarily.

SYSTEM  CONSIDERATIONS 

A general system model for machine aided communication is shown in
Figure 1. The source represents any place from which information is to be
obtained, such as the visual world, sound, or a book. The role of the sensor
is to acquire information from the source and transduce it into a form
suitable for processing. The processor may perform storage, information
filtering, recognition, or encoding. The display block presents information to
the human being, who is divided into sensory channels and user. The
former represents the sensory transformations which take place in the
nervous system, the latter the ultimate receiver of the information.

Figure 1 A System Model for a Machine Communication Aid

An important aspect of this model is user control. By way of control
links to the sensor, processor, and display, information can flow from the
user to the system so that the system can adapt or be programmed to satisfy
the requirements of the user. The importance of this control in learning
to receive a high rate of information through complex atypical stimulation
is implied by the experiments of Held (4). He has shown that compensation
for various optically produced rearrangements requires stimulation that
changes as a consequence of the movements of the observer.



Psychological data indicate that complex stimuli are needed to obtain
a high information rate. For example, the experiments of Eriksen (3) imply
that the rate at which man can receive information increases with the
dimensionality of the observations, even though changes along different
dimensions are perfectly correlated with one another so that any change in one
always involves a change in the others. Furthermore, Sumby and Pollack
(6) have shown that the rate of continuous information transmission
increases with the number of message alternatives.

Two  hypothetical  systems  for  obtaining  high  information  rates  are  (1) 
methods  in  which  the  source  messages  are  coded  by  the  system  into  “equal 
information units” and presented one at a time to the user; and (2) methods
in which a temporal or spatial display of the source information permits
the  user  to  code  the  messages  into  “equal  information  units.”  In  the  second 
system  the  control  links  discussed  above  play  an  important  role.  Both  of 
these  systems  attempt  to  convert  a  source  which  generates  information  at  a 
nonuniform  rate  into  a  constant  information  rate  system. 

The first system is based on an analogy with language units which
closely resemble perceptual units, such as stenography, Grade 2 braille,
Japanese kana, and morphemes. These language forms have a more
uniform information distribution among their basic units than the English
alphabet. For example, Grade 2 braille contains 180 contractions of the
more frequent words and letter combinations in English, so that in the
transformation from English letters to braille cells there is a marked increase
in the uniformity of information per symbol. Thus the information is in
approximately equal information units at point A in the model of Figure 1.
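
What “uniformity of information per symbol” means can be made concrete
with a small calculation. The sketch below uses toy probabilities chosen for
illustration, not measurements of English letters or braille cells, and
computes the information carried by each symbol as -log2(p) bits.

```python
import math


def surprisal_bits(probabilities):
    """Information carried by each symbol, -log2(p), in bits."""
    return [-math.log2(p) for p in probabilities]


def entropy_bits(probabilities):
    """Average information per symbol, in bits."""
    return sum(-p * math.log2(p) for p in probabilities)


# Toy numbers only (assumed for illustration).
skewed = [0.5, 0.25, 0.125, 0.125]  # symbols of unequal likelihood
even = [0.25, 0.25, 0.25, 0.25]     # recoded so each symbol is equally likely

print("skewed:", surprisal_bits(skewed), "mean", entropy_bits(skewed))
print("even:  ", surprisal_bits(even), "mean", entropy_bits(even))
```

In the skewed alphabet the symbols carry 1, 2, 3, and 3 bits, so a reader
taking in symbols at a fixed pace receives information at an uneven rate; in
the even alphabet every symbol carries 2 bits. A code that contracts frequent
words and letter groups moves a text toward the second case.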

This method has a disadvantage, however, which could possibly limit
its information rate, especially if the system is in widespread usage.
Information units which are psychologically equal depend on the user’s
expectations or uncertainties and not on the true probability distributions.
The user’s expectations are certainly not constant in time or among different
people. Therefore, unless the encoder has the ability to determine the user’s
expectations and then modify the code in an understandable fashion it can
at best only approximate a constant information rate. For example, Grade
2 braille only approximates a constant amount of information per cell in
the interest of having a standard code with some degree of permanence.

The second hypothetical system for obtaining a high information rate is
analogous to visual reading. When a person reads, his eyes move in saccadic
jumps to various fixation points. The length of these jumps and the time
spent at each fixation point vary from jump to jump.

Cherry (2) suggests that in reading the length of these saccadic jumps
and the fixation times may be related to the psychological information
content of the text. It may be that one purpose of these saccadic eye movements
is to group the printed letters and words into more or less equal information
blocks. In this way a fairly constant rate of perceptual information intake is
obtained, even though individual letters and words vary greatly in the
information they carry. In fact, the reason written language evolved into a
form in which equal lengths of symbols do not carry equal information may
be that the spatial aspect of vision is suited to grouping information
into equal units, thus permitting the length of a sequence of symbols to
be another stimulus dimension for coding.

For  this  method  in  which  equal  information  encoding  is  done  by  the 
user,  the  information  must  be  displayed  in  such  a  way  that  it  can  be  easily 
grouped  and  organized.  Thus,  at  point  B  in  the  model  of  Figure  1,  the 
information  is  in  approximately  equal  units.  The  advantage  of  this  method 
over  the  method  in  which  the  encoding  is  done  automatically  is  that  the 
psychological  expectations  are  used  in  the  encoding. 

A  Kinesthetic-Tactile  Communication  System 

The kinesthetic-tactile display to be described here was designed to
transmit English text. The input to this display could be the output of a
character reading machine, a teletypesetter tape, or an electric typewriter. In
this research study, however, punched paper tape was used as the
information carrying medium.

Since  the  fingers  have  many  degrees  of  freedom,  are  heavily  innervated, 
and  have  a  large  representation  in  the  cortex,  passive  movement  of  the 
fingers  was  chosen  as  an  information  carrying  stimulus  in  this  display. 
These  movements  are  obtained  from  a  set  of  eight  finger  rests  which  can 
be automatically positioned in three-dimensional space according to a
prearranged program.

The display is powered by compressed air and a pneumatic paper tape
reader is used to valve the air pressure to specific brass Sylphon bellows.
These bellows, in turn, move the finger rests. Figure 2 shows the basic
system used for each finger rest. The three bellows assemblies have mutually
perpendicular axes. With this basic unit 3^3 = 27 positions of the finger rest
are possible.

The complete system with eight finger rests is shown in Figure 3. This
display has 3^24 ≈ 2.8 × 10^11 possible states, which gives considerable
flexibility for experimentation with complex stimuli. An attempt was made
to enhance the tactile cues from the relative motion between the fingers
and the finger rests by making half of each finger rest have a different
texture. Moreover, an edge separates each half of a finger rest so that a tactile
reference direction is given to aid the user in orientation.
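
The counts of positions and display states can be checked with a few lines
of arithmetic. The sketch assumes, as the figure of 27 implies but the text
does not state outright, that each of the three bellows assemblies of a
finger rest can take three positions; the logarithm at the end is added
here only to express the state count in bits.

```python
import math

POSITIONS_PER_AXIS = 3  # assumed: three settings per bellows assembly
AXES_PER_REST = 3       # three mutually perpendicular bellows assemblies
FINGER_RESTS = 8

positions_per_rest = POSITIONS_PER_AXIS ** AXES_PER_REST
display_states = positions_per_rest ** FINGER_RESTS
bits_per_state = math.log2(display_states)

print(positions_per_rest)          # 27
print(display_states)              # 282429536481
print(round(bits_per_state, 1))    # 38.0
```

The exact count, 282,429,536,481, rounds to the 2.8 × 10^11 states quoted
for the complete display.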

There are many ways in which the pneumatic paper tape reader can be
used to program information presentations on the finger stimulator. Since
each finger rest can be in any one of 27 stimulus states at each instant of
time, any possible presentation can be described in terms of a
location-time-stimulus coordinate system. The essential difference between the
various presentations is the way in which the location aspect of the kinesthetic
sense is utilized. In one presentation (traveling wave), the location of the
stimulus is merely used as an aid to memory; that is, the information is
repeated at different locations, but the temporal order in which the locations
are stimulated indicates the sequential order of the information. In another
presentation (typewriter), location is used directly as an information
carrying dimension. We shall discuss only the traveling wave and typewriter
presentations.

Figure 2 Schematic Diagram of a Single Finger Stimulator

Figure 3 The Complete System with Eight Finger Rests

The  Traveling  Wave  Presentation 

In  this  presentation  each  movement  is  repeated  in  sequence  at  all  of  the 
finger  rests.  Thus  a  three-dimensional  traveling  wave  moves  across  the  dis¬ 
play.  This  display  method  is  analogous  to  the  news  presentation  in  Times 
Square,  New  York  City. 
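
The location-time pattern of the traveling wave can be sketched as a schedule of frames, one per time step. The sketch below is illustrative only: the function name, the use of `None` for an idle finger rest, and the direction of travel (from finger index 0 upward) are assumptions, not details given in the text.

```python
def traveling_wave(movements, n_fingers=8):
    """Return one frame per time step.

    Each frame lists the movement shown at each finger rest
    (None means that rest is idle at that step).
    """
    n_steps = len(movements) + n_fingers - 1
    frames = []
    for t in range(n_steps):
        frame = []
        for finger in range(n_fingers):
            k = t - finger  # index of the movement that has reached this finger
            frame.append(movements[k] if 0 <= k < len(movements) else None)
        frames.append(frame)
    return frames


# Two movements crossing a three-finger display.
for frame in traveling_wave(["a", "b"], n_fingers=3):
    print(frame)
```

An idle finger between letters can be represented by placing `None` entries in the input sequence.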

The  advantage  of  the  traveling  wave  presentation  is  that  it  attempts 
to  use  the  location  aspect  of  the  kinesthetic  and  tactile  senses  to  allow  the 
user  to  approximate  the  spatial  aspect  of  vision.  The  user  can  see  more 
than  one  letter  at  a  time.  He  can  attempt  to  fixate  at  one  finger  location,  yet 
have  some  peripheral  indication  of  letters  ahead  and  behind  the  fixation 
point.  In  this  way  it  might  be  possible  for  a  subject  to  learn  to  read  words 
instead  of  letters. 

A code was written to present the letters of the alphabet, numbers, and
symbols in the traveling wave presentation. Only six directions of movement are
used in this code. Up to two successive movements of one finger are used to
represent an alphanumeral; thus there are 36 distinguishable pairs of
movements and 6 distinguishable single movements, giving a total of 42 possible
stimuli.
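
The count of 42 stimuli can be reproduced by enumeration. In this sketch the six direction labels are hypothetical placeholders; the text does not name the directions used.

```python
from itertools import product

# Hypothetical labels for the six movement directions.
DIRECTIONS = ["up", "down", "left", "right", "toward", "away"]

singles = [(d,) for d in DIRECTIONS]          # one movement
pairs = list(product(DIRECTIONS, repeat=2))   # two successive movements
stimuli = singles + pairs

print(len(pairs), len(singles), len(stimuli))  # 36 6 42
```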

Several  experiments  were  performed  with  the  traveling  wave  presenta¬ 
tion.  In  the  first  experiment  two  subjects  learned  the  code  for  the  six  letters 
e,  t,  o,  a,  n,  i,  and  an  attempt  was  made  to  see  if  they  could  learn  to  recog¬ 
nize  words  composed  of  these  letters.  Tapes  were  prepared  with  words  in 
random  order  from  a  list  of  about  60  words  containing  only  these  six  letters. 
Figure  4  shows  the  learning  curves  obtained  from  this  experiment.  Only  one 
hand  was  used  for  these  trials. 

It was felt by the subjects in this experiment that the repetition and the
presentation of more than one letter at a time caused confusion. It seemed
by introspection that the presentation of two letters simultaneously
resulted in a sensation that was not easily recognized as the sum of the
sensations received when each letter was presented alone.

In order to determine the effect of the space between letters on
information transmission, two tapes of random words composed of the letters
e, t, o, a, n, and i were prepared. The letters of the words on one of the tapes
were spaced so that there was one unstimulated finger between the letters;
on the other tape the letter spacing was such that two unstimulated fingers
were between letters. Both tapes were presented to three subjects at the
same letter speed (0.8 letters/second). Only one hand was used in this
experiment. The fact that fewer mistakes were made when the letters were
more widely separated confirmed the “confusion effect” observed previously
by the subjects.

Figure 4  Learning Curves for the Traveling Wave Display

In the third experiment with the traveling wave presentation, 42 random
triplets composed of the letters e, t, n, o, a, and i were presented to eight
subjects, using a single hand in each case. A multivariate information
analysis of the responses was made in order to separate the effects of letter
position in the triplet, individual differences, and movement directions. The
time for each triplet was 1½ seconds and the time between triplets was as
long as the subject needed to respond (about 5 seconds).

Table 1 gives the results of this experiment in terms of these measures.
An information rate of 2.58 bits/letter was the maximum possible in this
experiment. In explaining these results, it is helpful to recall the
location-time plane for this presentation. The second letter of the triplet never occurs
alone, while the first and third letters occur alone twice. Therefore, it is
interesting to note that the average transmitted information for the letter
position 2, $T_{p_2 0}(s;r)$, was only about 70 percent of the transmitted
information for the letter position 1, $T_{p_1 0}(s;r)$, or the letter position 3,
$T_{p_3 0}(s;r)$.

TABLE 1

TRANSMITTED INFORMATION FOR THE TRAVELING WAVE PRESENTATION

Subject j   $T_{p_1 0 j}(s;r)$   $T_{p_2 0 j}(s;r)$   $T_{p_3 0 j}(s;r)$   $T_{0 j}(s;r)$
1           1.78                 1.25                 1.67                 1.30
2           1.58                 1.49                 1.39                 1.16
3           1.81                 1.33                 0.99                 1.16
4           1.72                 0.79                 1.41                 0.89
5           0.96                 0.61                 0.61                 0.39
6           0.82                 0.75                 1.40                 0.74
7           0.63                 0.63                 1.40                 0.64
8           1.23                 0.58                 1.24                 0.54
Averages    1.31                 0.93                 1.26                 0.85

This  result  implies  that  when  two  letters  are  presented  at  once  the  informa¬ 
tion  rate  per  letter  is  significantly  reduced. 
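
Two of the figures quoted in this discussion can be checked directly. The sketch below assumes that the 2.58 bits/letter maximum is the information in a choice among six equally likely letters, and it takes the position averages from the last row of Table 1; the variable names are illustrative.

```python
import math

# Maximum information per letter for six equally likely letters.
max_bits = math.log2(6)
print(round(max_bits, 2))  # 2.58

# Average transmitted information by letter position (Table 1, "Averages").
average_by_position = {1: 1.31, 2: 0.93, 3: 1.26}

ratio_2_to_1 = average_by_position[2] / average_by_position[1]
ratio_2_to_3 = average_by_position[2] / average_by_position[3]
print(round(ratio_2_to_1, 2), round(ratio_2_to_3, 2))  # 0.71 0.74
```

Both ratios are close to the "about 70 percent" stated for the middle letter.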


The  Typewriter  Presentation 

In  the  typewriter  presentation  the  subject’s  fingers  are  moved  just  as  he 
would  move  them  if  he  were  typing.  That  is,  finger  movements  in  the  vertical 
direction indicate the “home” keys (a, s, d, f, j, k, l); finger movements
away  from  the  body  indicate  the  characters  corresponding  to  the  third  row 
on  a  typewriter;  and  movements  toward  the  body  indicate  the  bottom  row 
characters.  In  addition,  the  following  modifications  are  used: 

(1)  The “home” characters are indicated as follows: a-down, s-up,
d-down, f-up, j-down, k-up, l-down. (Since more errors in
localization result from confusion between adjacent fingers (1), this
arrangement was chosen to make adjacent characters differ more.)

(2)  The  space  bar  is  indicated  by  a  simultaneous  movement  in  the  * 
direction  of  the  middle  fingers  on  both  hands. 

(3)  The numbers, which are on the fourth row of a typewriter, are
indicated by a simultaneous movement of the little finger on the hand
opposite to that corresponding to the number being sent.

(4)  Lower  case  and  upper  case  are  indicated  by  the  simultaneous 
movement  of  three  fingers. 
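
The core of the typewriter code can be sketched as a lookup table. This is a partial illustration under stated assumptions: it uses the standard touch-typing columns above and below the seven home keys named in the text, and it leaves out the inner index-finger columns, the right little finger, the space bar, numbers, and case shifts, whose stimuli are not fully specified here. The names `HOME`, `COLUMNS`, and `build_code` are hypothetical.

```python
# Home-key fingers and the vertical direction the text assigns to each.
HOME = {
    "a": ("left", "little", "down"),
    "s": ("left", "ring", "up"),
    "d": ("left", "middle", "down"),
    "f": ("left", "index", "up"),
    "j": ("right", "index", "down"),
    "k": ("right", "middle", "up"),
    "l": ("right", "ring", "down"),
}

# Assumed standard touch-typing columns: (third-row key, bottom-row key).
COLUMNS = {
    "a": ("q", "z"), "s": ("w", "x"), "d": ("e", "c"), "f": ("r", "v"),
    "j": ("u", "m"), "k": ("i", ","), "l": ("o", "."),
}


def build_code():
    """Map each covered character to (hand, finger, direction)."""
    code = {}
    for home_key, (hand, finger, direction) in HOME.items():
        code[home_key] = (hand, finger, direction)
        top, bottom = COLUMNS[home_key]
        code[top] = (hand, finger, "away")       # third row: away from the body
        code[bottom] = (hand, finger, "toward")  # bottom row: toward the body
    return code


CODE = build_code()
print(CODE["a"])  # ('left', 'little', 'down')
print(CODE["e"])  # ('left', 'middle', 'away')
```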

Three  experiments  were  performed  with  the  typewriter  presentation.  In 
one  phase  of  experimentation,  a  blind  subject  spent  approximately  two 
hours  a  week  for  two  months  practicing  with  this  display.  This  training 
period  is  described  below. 

In the first practice period the subject became familiar with the machine
and the stimuli for the letters, numbers, and symbols. Since she already
knew how to type, the amount of learning required was slight. In the second
practice period a list of the most frequently used English words was
presented, and by the end of the hour these could all be recognized correctly
at a rate of about 5 words per minute. In the succeeding periods the subject
practiced with material from a fourth grade reader and the Reader's Digest.
After about 6 one-hour sessions, the subject was reading this material at a
rate of approximately 10 words per minute.

At  this  point  some  improvements  were  made  on  the  system.  Wind  and 
rewind  tape  reels  were  added  to  the  punched  paper  tape  reader  with  controls 
that  could  be  operated  by  the  subject.  These  controls  consisted  of  micro¬ 
switches  so  placed  that  the  subject  could  operate  them  with  his  thumbs. 
Pressing  with  the  right  thumb  stopped  the  display  and  pressing  with  both 
thumbs  reversed  the  tape.  In  this  way,  if  a  letter  or  word  was  missed  it 
could  be  repeated.  Also,  the  start/stop  switch  controlled  with  the  right 
thumb  served  as  a  rudimentary  speed  control. 

With  the  addition  of  these  controls  there  was  an  almost  immediate  in¬ 
crease  in  word  rate  to  about  15  words  per  minute.  However,  due  to  the  way 
this  presentation  was  programmed,  the  air  pressure  was  on  only  about  one- 
eighth  of  the  time.  This  resulted  in  a  decrease  in  the  amplitude  of  motion  at 
speeds  greater  than  15  words  per  minute;  therefore,  this  reading  rate  was 
probably  a  machine  limitation  not  a  subject  limitation. 

In the second experiment the 42 random triplets used in the traveling
wave experiment described above were presented to eight subjects in order
to obtain a comparison between the traveling wave and typewriter
presentations. Figure 5 compares the results of the two experiments on an
information-transmitted basis. The average transmitted information $T_p(s;r)$ was
1.75 bits/letter for the typewriter presentation and 0.73 bits/letter for the
traveling wave presentation out of a maximum possible 2.58 bits/letter.


[Figure 5 legend: traveling wave; typewriter. Annotations: 2 letters/sec; H(x) = 2.58 bits/letter.]


Figure  5  Comparison  Between  Performance  Obtained  with  the  Traveling 
Wave  and  Typewriter  Displays 


In  the  third  experiment  30  symbols  (the  alphabet,  comma,  period, 
space,  and  upper  case)  in  typewriter  code  were  presented  in  random  order 
to  one  subject.  Sequences  of  these  symbols  were  generated  with  the  aid  of 
a  random  number  table  so  that  all  symbols  were  equally  probable.  There¬ 
fore,  the  self-information  of  each  symbol  was  4.91  bits.  The  experimental 
stimuli  consisted  of  six  sequences  of  130  symbols  each  and  the  presentation 
rate  of  each  sequence  was  chosen  to  cover  the  range  from  0.5  to  1.5  letters/ 
second.  The  subject  responded  orally  by  naming  the  symbols  as  they  were 
received. 

Figure  6  shows  the  results  of  this  experiment.  The  information  in  the 
correct  responses  was  calculated  by  means  of  the  following  formulas : 


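$$T = \frac{\text{percent correct}}{100} \times 4.91 \ \text{bits/symbol}$$

$$\text{information rate} = \frac{\text{percent correct}}{100} \times 4.91 \times \text{presentation rate} \quad \text{(bits/second)}$$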
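
The two formulas can be written out as a short Python sketch. The function names are illustrative, and the last function simply inverts the rate formula; it is not a calculation reported in the text.

```python
import math

# Self-information of one of 30 equally probable symbols.
SYMBOL_BITS = math.log2(30)
print(round(SYMBOL_BITS, 2))  # 4.91


def bits_per_symbol(percent_correct):
    """Information in the correct responses, per symbol."""
    return percent_correct / 100 * SYMBOL_BITS


def bits_per_second(percent_correct, symbols_per_second):
    """Information rate at a given presentation rate."""
    return bits_per_symbol(percent_correct) * symbols_per_second


def implied_percent_correct(rate_bits_per_second, symbols_per_second):
    """Percent correct that would yield the given information rate."""
    return 100 * rate_bits_per_second / (SYMBOL_BITS * symbols_per_second)


# A rate of 4.5 bits/second at 1.32 symbols/second would correspond,
# under these formulas, to roughly 69 percent correct.
print(round(implied_percent_correct(4.5, 1.32)))  # 69
```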


Figure 6  Information Rates Versus Presentation Rate for the Typewriter
Display
[Abscissa: presentation rate (symbols/sec)]

The maximum information rate obtained was 4.5 bits/second, which
occurred at a presentation rate of 1.32 letters/second. However, this rate
is probably not the maximum attainable for the following reasons:

(1)  More  practice  would  probably  increase  the  information  rate.  (The 
subject  for  this  experiment  had  less  than  15  hours  of  practice.) 

(2)  The  symbols  in  typewriter  code  are  not  optimally  distributed 
among  the  fingers.  Six  symbols  are  assigned  to  the  index  finger  on 
each  hand,  while  only  three  symbols  are  assigned  to  each  of  the 
other  fingers.  Consequently,  73  percent  of  the  errors  involved  the 
12  letters  assigned  to  the  index  fingers. 


(3)  The information rate could probably be increased if more than 30
alternatives were used.

(4)  The information rate could probably be increased if the subject
were allowed to control the speed continuously.

(5)  Typewriter code has the disadvantage of presenting only one
symbol at a time and not allowing the subject to “see ahead.”

A  COMPLEX  TACTILE  DISPLAY 

Whether  the  second  hypothetical  method  for  obtaining  a  high  information 
rate  (i.e.,  the  subject  codes  the  messages  into  perceptual  units)  is  possible 
may  depend  on  the  extent  to  which  the  skin  can  be  used  to  perform  retinal 
functions  such  as  detailed  responses  to  complex  stimulus  patterns. 

The  use  of  a  large  number  of  tactile  stimulators  presents  control,  power, 
and  reliability  problems  as  well  as  psychological  problems.  Since  ultimately 
something  must  be  moved  to  stimulate  the  skin  (except  for  thermal,  chem¬ 
ical,  and  electrical  stimulation)  data  rates  comparable  to  a  television  picture 
are  difficult  to  obtain.  Much  research  needs  to  be  done  on  an  efficient  stim¬ 
ulator  that  can  be  activated  rapidly. 

Mechanical stimulators can be divided into the following categories:
poke probe, vibrator, and air jet. Poke probes have the disadvantage that
the sensation rapidly damps out. Vibration does not adapt as rapidly nor as
much as static pressure and the sensation is more desirable. However, the
two-point limen for vibration is greater than the static two-point
limen.

In  one  study  Michaels  (5)  investigated  the  possibility  of  using  a  small 
(about  0.020-inch  diameter)  moving  air  jet  as  a  tactile  stimulator.  While 
an  air  jet  is  difficult  to  localize  if  it  is  stationary,  localization  is  much  finer 
if  it  is  moving.  Other  advantages  of  air  jet  stimulation  are  that  direct  connec¬ 
tion  to  the  skin  is  not  necessary  and  that  the  stimulator  can  be  easily  built 
and  controlled. 

Figure  7  shows  how  a  simple  matrix  air  jet  stimulator  could  function.  In 
this  display  device  the  control  is  performed  by  simple  valves  consisting  of 
iron  spheres,  each  of  which  can  be  moved  away  from  an  orifice  by  a  small 
electromagnet. 

Two effects that have been found with air jet stimulation are apparent
location and apparent motion. Apparent location is the effect that occurs on
some parts of the body in which two or more air jets are felt as one. The
location of this single sensation can be moved around by varying the relative
air pressure among the stimulators. Apparent motion occurs when bursts
of air from two or more stimulators stimulate the skin with different times
of onset.

Figure 7  Operation of a Simple Matrix Air Jet Stimulator
[Figure labels: solenoid; air supply]

Mr. John Eige is developing pneumatic logic elements at Stanford
Research Institute which may prove useful in the design of an air jet tactile
display. Figure 8 shows one basic logic element called the “inhibitor,” which
is simple, inexpensive to build, and extremely reliable. When line pressure
is applied at A but not at B the two balls in Figure 8 will be in the position
shown, so that line pressure will also be at C. However, if line pressure is
applied at B, then no output will occur at C. (Since the balls have different
areas the air pressure at B may actually be less than line pressure.)
Because of the gain in this element an oscillator can be designed to give tactile
vibrations directly.

Also,  a  paper  tape  reader  and  decoder  can  be  made  using  only  these 
elements.  Figure  9  shows  how  this  can  be  achieved  for  two  hole  tape  and 
Figure  10  shows  a  physical  realization  of  this  system. 
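
The logical behaviour of the inhibitor, and the claim that a decoder can be built from it alone, can be illustrated in Python. This is a sketch under assumptions: the decoder wiring below is one possible construction and is not taken from Figure 9, and the oscillator is modeled simply as the output fed back to input B with one step of delay. All function names are hypothetical.

```python
def inhibitor(a, b):
    """Output C carries line pressure only when A is pressurized and B is not."""
    return a and not b


SUPPLY = True  # a line held permanently at line pressure


def invert(x):
    """NOT built from one inhibitor fed by the supply line."""
    return inhibitor(SUPPLY, x)


def both(x, y):
    """AND built from two inhibitors: x AND NOT (NOT y)."""
    return inhibitor(x, invert(y))


def decode_two_hole(h1, h2):
    """One output line per hole pattern; exactly one line is pressurized."""
    return {
        "00": inhibitor(invert(h1), h2),
        "01": inhibitor(h2, h1),
        "10": inhibitor(h1, h2),
        "11": both(h1, h2),
    }


def oscillate(steps):
    """Assumed feedback loop: the output returns to B after one delay step."""
    c = False
    outputs = []
    for _ in range(steps):
        c = inhibitor(SUPPLY, c)
        outputs.append(c)
    return outputs


print(decode_two_hole(True, False))
print(oscillate(4))  # [True, False, True, False]
```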


[Figure 8 panels: basic pneumatic logic element; oscillator. Labels: exhaust; supply.]

Figure  8  The  Inhibitor — A  Simple,  Inexpensive,  and  Reliable  Logic 
Element 


Figure  9  A  Paper  Tape  Reader  for  Two  Hole  Tape  Based  on  a  Pneumatic 
Logic  Element 




Figure  10  Physical  Realization  of  a  Two  Hole  Tape  Reader  and  Decoder 
Based  on  the  Inhibitor 


REFERENCES 

1.  Bliss, J. C., and R. J. Massa, “A Visual and a Kinesthetic-Tactile Experiment in
Pattern Recognition,” MIT Quart. Prog. Rep., No. 61 (April 1961), pp. 253-260
(Research Laboratory of Electronics).

2.  Cherry, C. On Human Communication. New York: Technology Press and J.
Wiley, 1957.

3.  Eriksen, C. W., “Multidimensional Stimulus Differences and Accuracy of
Discrimination,” USAF WADC Tech. Rep., 1954, pp. 54-165.

4.  Held, R., “Exposure-History as a Factor in Maintaining Stability of Perception
and Coordination,” J. Nerv. Ment. Dis., Vol. 132 (1961), pp. 26-32.

5.  Michaels, S. B. “An Investigation of a Tactual Method of Information Transfer.”
Unpublished thesis, Massachusetts Institute of Technology, 1961.

6.  Sumby, W. H., and I. Pollack, “Short-Time Processing of Information,” HFORL
Report, No. TR-54-6, January 1954.


POSSIBLE  USES  OF 
A  PRINTED  BRAILLE  READER 
WITH  SPELLED  SPEECH  OUTPUT* 

MICHAEL  P.  BEDDOES 

University  of  British  Columbia,  Vancouver,  Canada 


INTRODUCTION 

For  the  sighted  person  reading  is  effortless  and  fast.  The  same  situation  is 
to  be  aimed  for  with  reading  machines  for  the  blind  (5).  It  is  unfortunately 
true  that  the  most  successful  reading  machine  designed  in  the  past,  the 
Optophone (1), is anything but easy to read with and can be used by only
a  few  very  gifted  people. 

Every  blind  reading  machine  must  include  a  pickup  device  for  ob¬ 
taining  information  from  the  printed  page;  signals  from  this  device  are 
generally  processed  before  actuating  a  transducer  which  converts  the  signals 
into  forms  which  can  be  assimilated  by  the  blind  user.  In  designing  a  com¬ 
paratively  simple  machine  too  much  strain  can  be  put  on  the  subject’s 
powers  of  interpretation.  This  paper  deals  with  two  problems  connected 
with  a  more  complex  type  of  machine,  a  letter  recognition  machine,  in 
which operation is letter by letter (as in the Optophone), but whose final
(audible)  output  is  much  more  readily  interpreted. 

Purely  automatic  means  can  be  found  for  moving  the  pickup  device 
along  each  line  to  be  read.  It  is  possible,  however,  to  use  hand  positioning. 
Recent  experiments  with  the  Battelle  Optophone  (5)  have  shown  that  very 
slow  reading  rates  (of  about  18  words  a  minute)  are  possible  this  way. 
Higher  rates  are  possible  using  clues  of  position  delivered  tactually  through 
buzzers  to  the  finger.  Experiments  with  such  a  method  will  be  described. 

Audible letter codes can be used for conveying the information to the
blind person. The Optophone code, tonal Morse (2), and spelled speech
are all possible. The last has the advantage that usually very little learning
need be undertaken for its mastery apart from the learning received during
childhood, when spelling is taught through the use of natural spelled speech.
Naturally produced spelled speech is limited to about 50 words per minute.
The limitation is mainly vocal. When the spelled speech rate is artificially
increased above 50 words a minute, comprehension is easy after a small
adjustment period. Metfessel (6) originated the notion of such a code. This paper
describes a method of artificially increasing the rate of spelled speech. The
method is different from Metfessel’s and a speed of 110 words per minute
has been readily attained by blind subjects.

*  The financial support of the National Research Council of Canada is gratefully
acknowledged. The experimental work was done at the University of British Columbia
and thanks are due to graduate student J. Papsdorf and research assistant T. Orme
for their assistance in the experimental work.

Our  experiments  with  this  method  throw  some  light  on  the  perceptual 
mechanism  by  which  spelled  speech  characters  are  identified  by  the  ear. 

A  printed  braille  reader  was  constructed  in  order  to  demonstrate  the 
performance  when  probe  positioning  and  spelled  speech  decoding  are 
simultaneously  performed  by  the  human  operator.  This  machine  operates 
with  printed  braille  over  which  the  pickup  device  is  moved  by  the  subject’s 
hand.  The  output  code  is  spelled  speech  delivered  by  headphones  to  the 
subject’s  ears.  The  spelled  speech  waveforms  are  stored  on  magnetic  tape 
in  the  machine.  No  test  data  have  yet  been  obtained  with  this  machine. 

The  problem  of  identifying  letters  from  normally  printed  pages  is  not 
considered. 

EXPERIMENTS  WITH  POSITIONING 
A  HAND-HELD  PICKUP  DEVICE 

Apparatus  for  investigating,  as  a  separate  issue,  the  problem  of  positioning 
a  hand-held  pickup  device  is  shown  in  Figure  1.  Tactile  presentation  of 
position  clues  was  chosen  in  order  to  separate  the  function  of  positioning 
from  the  function  of  decoding  the  spelled  speech,  which  uses,  so  to  speak, 
another  communication  channel.  Two  buzzers  and  two  photodiodes  are 
situated  on  the  pickup  device  (see  Figure  2).  The  buzzers  were  separately 
controlled  by  signals  from  transistor  units,  each  unit  being  driven  by  one 
photodiode.  The  buzz  frequency  was  60  cps  and  the  mechanical  move¬ 
ment  was  conveyed  to  the  finger  tips. 

A photographic negative simulated the printed page and was
illuminated from underneath. The negative was black over most of its area
with transparent slits simulating the effect of lines of print. Figure 3 shows
the simplest type of negative used. When the photodiodes were masked
by this negative from the light source the buzzers did not operate, but
when a photodiode was directly over one of the slits the corresponding
buzzer vibrated.

Figure 1  Pantograph for Testing Location

Figure 2  Photocell and Buzzer Assembly of Pantograph Device of Figure 1

The  buzzers  indicated  errors  of  position.  One  indicated  that  the  probe 
was  above  the  line  and  should  be  pulled  towards  the  operator;  the  other 
had  the  opposite  effect.  When  reading  along  a  line  of  print  with  no  ‘vertical’ 
errors  no  buzzers  sounded.  A  pantograph  linkage  (see  Figure  1)  enabled  a 
record  to  be  obtained  of  the  probe’s  movement. 

Experimental  Procedure 

Two situations are envisaged and each has been separately tested. In one,
the only control of the pickup device’s position is muscular. Errors of
only about 1 mm can be tolerated. In the other, gross positioning of the
probe is muscular, but small errors can be compensated for by a movable
platform mounted on the hand-held device and separately controlled by
its own servomotor and photodiodes. Muscular errors of up to a quarter
of an inch can be tolerated.
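
The difference between the two situations can be expressed as a tolerance on vertical error. The sketch below is illustrative only: it assumes that a buzzer fires when the probe's offset from the line exceeds the stated tolerance, and it adopts a sign convention (positive offset means the probe is above the line) that the text does not specify. The names are hypothetical.

```python
MC_TOLERANCE_MM = 1.0          # muscular control alone: about 1 mm
MS_TOLERANCE_MM = 0.25 * 25.4  # muscular plus servo: a quarter of an inch


def buzzer(offset_mm, tolerance_mm):
    """Return the correction signalled to the operator, or None if on line.

    Assumed convention: offset_mm > 0 means the probe is above the line,
    that is, too far from the operator.
    """
    if offset_mm > tolerance_mm:
        return "pull toward operator"
    if offset_mm < -tolerance_mm:
        return "push away from operator"
    return None


print(buzzer(2.0, MC_TOLERANCE_MM))  # pull toward operator
print(buzzer(2.0, MS_TOLERANCE_MM))  # None
```

The same 2 mm error that stops the operator under muscular control alone falls within tolerance when the servo platform takes up small errors.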


Figure  3  Reading  Machine  Test  Sheet 



Tests  for  the  muscular  control  alone  (MC)  were  started  by  the  ex¬ 
perimenter  positioning  the  probe  for  the  subject  at  the  left  of  one  of  the 
lines.  The  subject  was  then  required  to  draw  the  probe  from  left  to  right 
along  the  line.  Whenever  an  error  was  registered  by  one  of  the  buzzers 
this  motion  was  stopped  and,  as  a  separate  operation,  the  error  was  eli¬ 
minated  by  moving  the  probe  at  right  angles  to  its  former  motion.  When, 
and  only  when,  the  error  was  removed  the  subject  was  permitted  to  move 
the  probe  once  more  along  the  line  of  ‘print.’  Initial  experiments  were 
conducted  on  a  line-by-line  basis. 

The  procedure  proved  to  be  excellent  for  learning  to  decipher  the  clues 
and  correlate  them  with  the  required  muscular  movement,  i.e.,  automati¬ 
cally  associating  vibrations  from  one  buzzer  with  a  downward  muscular 
movement  and  from  the  other  with  an  upward  movement.  The  subject 
was  soon  able  to  start  himself  at  the  beginning  of  a  line  of  print.  Me¬ 
chanical  stops  were  used  to  limit  the  horizontal  movement. 

Tests  for  the  muscular  and  servo-assisted  positioning  of  the  probe 
(MS)  proceeded  in  much  the  same  way,  but  with  the  following  relaxation 
in  requirements:  the  subject  was  allowed  to  correct  a  vertical  error  while 
proceeding  in  the  left-to-right  motion.  Much  higher  speeds  were  possible, 
but  as  a  learning  situation  this  method  was  inferior  to  that  described  above. 
In  certain  cases  subjects  would  make  more  mistakes  toward  the  end  of  a 
test  than  at  the  beginning,  and  noticeably  more  concentration  was  needed 
to  eliminate  bad  habits  arising  from  occasional  mistakes.  Operators  who 
were  exposed  to  two  or  more  trials  with  single  lines  were  then  required 
to  trace  out  a  complete  page  of  19  lines. 

Experimental  Results 

Some typical scores of errors and speed on a line basis are shown in Figures
4, 5, 6, and 7. In Figure 4 a particularly slow subject is shown. He was
started with the MS method and made a time of about 20 to 25 seconds
per line. His performance did not seem to be capable of improvement.
With the MC method the same subject felt more confident of mastering
the procedure and attained a comfortable speed of 9 to 11 seconds a line.
His best speed using the MS method in a subsequent test was 6½ seconds
a line (see Figure 5). This is equivalent to about 46 words per minute
(each line contained five words).

Figures 6 and 7 show the performance of a superior subject (MB).
With the MS method times of about 5 seconds per line were easily obtained;
with the MC method times ranging from 7.5 down to 3.5 seconds were
obtained as the test proceeded (see Figure 7). Time to follow a complete
page of 19 lines was 120 seconds. As each page contained an average of
100 words this rate corresponds to 50 words per minute. This is half the
maximum rate obtained with spelled speech.

Figure 4  Probe Positioning Tests of Slow Subject I. Initial performance
with MS method. [Axis label: errors]

Figure 5  Probe Positioning Tests of Slow Subject II. Performance with
MS method following some training with the MC method. [Axis label: errors]
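
The conversions from line or page times to reading rates can be checked with a one-line function. The function name is illustrative; the word counts and times are those given in the text.

```python
def words_per_minute(words, seconds):
    """Convert a word count and an elapsed time in seconds to words per minute."""
    return words / seconds * 60


print(round(words_per_minute(5, 6.5)))    # 46  (five-word line in 6.5 seconds)
print(round(words_per_minute(100, 120)))  # 50  (100-word page in 120 seconds)
print(round(words_per_minute(5, 3.5)))    # 86  (five-word line in 3.5 seconds)
```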

Discussion  of  Positioning  a  Hand-Held  Probe 

The experiments have shown that the versatility of the human operator
may be used to position quite accurately the pickup probe of a reading
machine. Sighted subjects were used and the performance with such
subjects is probably inferior to what may be expected with the blind, since
blind persons are more sensitive to subtle tactile and aural clues.

The MC method (stopping whenever a mistake is registered and
eliminating it before proceeding with the line scan) produces an ideal
learning situation. The speeds obtained with this method correspond to about
80 words per minute, rather less than the minimum speed (100 words
per minute) aimed for. Further work is proceeding to ascertain whether
this method can be used with highly trained blind subjects in order to
obtain higher speeds. Considerations of simplicity of apparatus are
attractive.

Figure 6  Probe Positioning Tests with Normal Subject I (MS Method)

The  MS  method  imposes  much  less  strain  on  the  operator,  but  learning 
bad  habits  is  relatively  easy.  Learning  such  bad  habits  is  easier  because 
the  method  is  basically  faster  and  the  subject  tries  to  attain  his  maximum 
speed.  In  a  reading  situation  the  reader  is  more  concerned  with  obtaining 
information  than  attaining  high  reading  rates  and  the  temptation  that  gives 
rise  to  such  mistakes  is  removed. 


Figure  7  Probe  Positioning  Tests  with  Normal  Subject  II  (MC  Method) 




SPELLED  SPEECH 

Test  material  was  recorded  on  tape  using  the  following  method:  first,  an 
oral  record  was  made  of  spelled  speech  using  a  tape  recorder;  second,  the 
spelled  speech  passage  so  made  was  played  back  at  increased  speed  to 
reduce  the  time.  Listening  tests  made  with  such  material  show  that  such 
reductions  adversely  affect  intelligibility. 

The  time  compression  increases  the  pitch  of  the  speech  so  that  it  re¬ 
sembles  chipmunk  sounds;  and  it  is  this  increase  in  pitch  which,  it  is 
thought,  reduces  intelligibility.  The  pitch  can  be  reduced  by  a  further  op¬ 
eration  on  the  message  by  a  Doppler  compressing  machine  (a  Vari-Vox 
machine)  (see  Figure  8  [4]). 

The two operations produce the same effect that could be produced
by tape cutting and splicing. Suppose the original tape is cut into equal
lengths and all odd-numbered tape slices are joined together while
even-numbered ones are discarded. Then, when this tape is played back at normal
speed, the message will occupy only half the time and the pitch will appear
to be normal. Certain discontinuities will be present where the tape
fragments join one another, but such discontinuities produce little effect on
intelligibility.

[Figure 8 labels: magnetic tape; $I_d$ = discard period; $I_s$ = interruption period; $f_i$ = interruption frequency = $1/I_s$; compression ratio]

Figure 8  Definitions for Time Compression of Speech

Using the above explanation, some quantities must be defined. The
discard period $I_d$ and the interruption period $I_s$ are basic quantities. An
interruption frequency is given by:

$$f_i = 1/I_s$$

while the compression ratio is:

$$R = I_d/I_s$$

The quantities $f_i$ and $R$ are obtainable from settings on the Vari-Vox
machine. (The notation used here is that of Fairbanks and Kodman [4].)
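
The cutting-and-splicing description of time compression can be sketched on a list of samples. This is an illustration, not the Vari-Vox mechanism: it assumes that the interruption period is the full repeating cycle, that the discarded portion comes at the end of each cycle, and that the compression ratio is the discard period divided by the interruption period. The function and parameter names are hypothetical.

```python
def compress(samples, period, discard):
    """Keep the first (period - discard) samples of every block of `period` samples.

    `period` plays the role of the interruption period and `discard` the
    discard period, both measured in samples.
    """
    if not 0 <= discard < period:
        raise ValueError("discard must satisfy 0 <= discard < period")
    keep = period - discard
    out = []
    for start in range(0, len(samples), period):
        out.extend(samples[start:start + keep])
    return out


tape = list(range(8))
print(compress(tape, period=2, discard=1))  # [0, 2, 4, 6]
print(compress(tape, period=4, discard=2))  # [0, 1, 4, 5]
```

Both calls discard half of the tape, but the second uses a longer interruption period and therefore a lower interruption frequency.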

Experimental  Procedure  and  Material 

Two  groups  of  tests  were  performed.  In  one  complete  sentences  were  used 
as  test  material.  The  purpose  of  such  tests  was  to  ascertain  the  ultimate 
speed  which  could  be  decoded  by  blind  subjects  and  how  long  the  learning 
of  techniques  for  mastering  such  material  would  take.  In  the  other  isolated 
letters  were  used.  Such  material  was  useful  in  analyzing  the  effects  of  com¬ 
pression  on  intelligibility.  In  addition,  some  clues  are  provided  by  these 
tests  as  to  the  main  features  the  operator  uses  to  identify  a  letter. 

The  tests  were  prepared  on  magnetic  tape  and  the  signal  from  this  tape 
was  piped  to  the  headphones  of  the  subjects.  The  subjects  could  adjust 
their  headphones  for  comfortable  listening  volume. 

Results 

Shown in Figure 9 are the sentence performance responses of four blind
subjects. The score is plotted against speed of delivery in words per minute.
A comfortable rate of 110 words per minute is possible with 82 percent
correct responses. The results are particularly encouraging when it is
considered that the subjects had not been exposed previously to such material.
Tests took about one hour and what learning there was took place in this
time. No particular attention was paid to the interruption frequency $f_i$
corresponding to these various rates. It is thought as a result of the letter
experiments (below) that a poor choice of $f_i$ is responsible for the
minimum score which occurs at an 89 word per minute rate.


Figure 9  Sentence Recognition Tests (percent score plotted against words per minute)

Some letter performance responses of sighted subjects are shown in
Figures 10, 11, and 12. All the tests were taken with a compression ratio
of R = 1/2. Each letter of the alphabet was presented in random order and each letter
appeared ten times. The variable is the interruption frequency fi. Curious
patterns emerge and these will be discussed later. An immediate conclusion
is that for maximum intelligibility the interruption frequency must
generally be high (about 130 cps). Exceptions are V, D, P, B, E, G, and C.
Best results are obtained if fi is adjusted to an optimum for each letter.

Discussion  of  Letter  Tests 

The shapes of the curves in Figures 10, 11, and 12 are provocative. Why the
dips?  Why  the  shape  in  general?  Some  mechanisms  will  be  described 
which  can  explain  their  shapes  and  give  clues  of  letter  identification. 

For very low fi, Miller says in effect that the percent intelligibility of a
message tends towards 1 - R (7). In our experiments this quantity is 50
percent and the letters V, B, N, M, and Q are probably tending this way.
Curves for other letters must, on this basis, have at least one maximum
or minimum in the range 0 < fi < 10 c/s.

Basically there appear to be three curve shapes: one shows a monotonic
decrease as fi is increased (e.g., the letter M in Figure 10); another
shows a small maximum and a minimum (e.g., letters V, G, B, etc., in
Figure 11); another class shows a maximum only (as in Figure 12). The
last two classes are probably manifestations of the same mechanism: most
of the letters end in ‘ee.’

If a letter consisted of two parts, a crucial one for letter identification
and another not so necessary, then evidently as the interruption frequency
fi is increased the chance that the discard period will correspond
with the crucial part will become progressively remote. In addition, the
crucial part must not be periodic in nature, at least within the same order
of magnitude of Is. This is so because such a periodic waveform will give
rise to the other type of curve discussed below.

Figure 10  Letter Recognition Tests. Graphs are generally monotonic. (Percent error plotted against interruption frequency in c.p.s.)

Figure 11  Letter Recognition Tests. Graphs show pronounced maxima and minima. (Percent error plotted against interruption frequency in c.p.s.)

Figure 12  Letter Recognition Tests. Graphs show pronounced maxima. (Percent error plotted against interruption frequency in c.p.s.)

If letter identification is by means of resonators, then we have a means
of explaining the second class of curve. For example, if the crucial part
of the letter for identification is a sine wave, then the Vari-Vox output for
a particular fi will consist of alternately following sine wave sections. These
enter the ear and hence the hypothetical resonator. If the sections’ phases
are continuously increasing, then the voltage in the resonator will build up;
identification is supposed to take place when the voltage reaches a certain
level. If on the other hand the sine wave sections correspond to another fi,
phases can jump 180 degrees at each discontinuity; with such an input
the resonator’s output will rise and fall and a large voltage will never be
reached. Obviously, matching the phase at the discontinuities will occur
for several values of fi. At each value a minimum number of errors will
occur and between such minima the errors will approach maxima.
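
The phase-matching argument above can be made concrete with a small calculation. The sketch below assumes that one interruption cycle lasts 1/fi seconds and that a fraction R of it is discarded, so that a sine wave of the given frequency advances by a fixed phase during each discarded stretch. The function names and the 100 c/s tone are hypothetical and chosen only for illustration.

```python
def phase_jump_degrees(tone_hz, fi_hz, ratio=0.5):
    """Phase step seen at each splice for a sine wave of `tone_hz`.

    Assumes each cycle lasts 1/fi_hz seconds and a fraction `ratio`
    of it is discarded (our reading of the compression ratio R).
    """
    return (360.0 * tone_hz * ratio / fi_hz) % 360.0


def matched_frequencies(tone_hz, ratio=0.5, count=3):
    """First `count` values of fi at which the splices are phase-continuous."""
    return [tone_hz * ratio / k for k in range(1, count + 1)]


# Hypothetical 100 c/s component with R = 1/2:
print(phase_jump_degrees(100, 50))    # whole number of cycles discarded
print(phase_jump_degrees(100, 100))   # half a cycle discarded
print(matched_frequencies(100))
```

Under these assumptions the phase is continuous at several values of fi and jumps by 180 degrees in between, which is the alternation of error minima and maxima that the resonator explanation predicts.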


THE  BRAILLE  READER 

The  ‘memory’  of  a  braille  reader  which  will  test  simultaneous  positioning  of 
the  hand-held  probe  and  decoding  spelled  speech  is  shown  in  Figure  13. 


Figure  13  Reproducing  Device  for  Spelled  Speech  Output 


Twenty-six  spelled  speech  letters  are  stored  on  tape  sections  and  these  are 
pulled  past  13  stationary  playback  heads  by  modified  pen  motors.  The  pen 
motors  have  a  delay  of  about  10  milliseconds  in  their  response.  Signals 
from  the  pickup  probe  are  fed  to  a  diode  logic  matrix  which  identifies  the 
letter  being  read  and  with  this  information  gives  the  correct  instruction  to 
the  motors,  and  switches  one  of  the  13  heads  to  the  headphone  amplifier. 
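
The decoding step described above can be sketched as a lookup table, which is what a diode matrix amounts to logically. This is a minimal illustration, not the circuit itself. It assumes that the probe reports which of the six braille dot positions are raised, and that the 26 letters are assigned two per head in alphabetical order; the source gives the counts (26 letters, 13 heads) but not the assignment, so `head_and_section` is hypothetical.

```python
# Standard braille dot numbers (1-6) for the letters a to z.
BRAILLE_DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "j": (2, 4, 5), "k": (1, 3), "l": (1, 2, 3), "m": (1, 3, 4),
    "n": (1, 3, 4, 5), "o": (1, 3, 5), "p": (1, 2, 3, 4),
    "q": (1, 2, 3, 4, 5), "r": (1, 2, 3, 5), "s": (2, 3, 4),
    "t": (2, 3, 4, 5), "u": (1, 3, 6), "v": (1, 2, 3, 6),
    "w": (2, 4, 5, 6), "x": (1, 3, 4, 6), "y": (1, 3, 4, 5, 6),
    "z": (1, 3, 5, 6),
}

# One entry per letter, keyed by the exact set of raised dots.
MATRIX = {frozenset(dots): letter for letter, dots in BRAILLE_DOTS.items()}


def decode(raised_dots):
    """Return the letter for a set of raised dots, or None if unknown."""
    return MATRIX.get(frozenset(raised_dots))


def head_and_section(letter):
    """Hypothetical assignment: two letters per head, in alphabetical order."""
    index = ord(letter) - ord("a")
    return index // 2, index % 2


letter = decode({1, 2})
print(letter, head_and_section(letter))
```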

CONCLUSIONS 

1.  A  hand-held  pickup  probe  with  tactile  clues  of  position  seems  to  be 
capable  of  controlling  the  position  adequately  up  to  line  rates  equivalent 
to 86 words per minute. Tests employed sighted subjects and training
periods were limited to about a half-hour. A learning procedure associated
with  the  method  seems  to  be  very  effective. 

2.  A  hand-held  probe  with  additional  servo-operated  control  of  the 
pickup  photodiodes  brings  about  speeds  corresponding  to  peaks  of  about 
150  words  per  minute.  Page  speeds  of  about  50  words  per  minute  were 
obtained. 

3. Spelled speech has been shown to be easy to produce, and the results obtained
with relatively crude concepts are very encouraging. Further experiments
have  shown  that  a  tailoring  of  the  interruption  frequency  to  the  particular 
letter  should  lead  to  noticeable  improvement.  In  the  method  advocated  there 
is  no  overlap  of  spelled  speech  letters.  No  attempt  has  been  made  to  produce 
a  quasilanguage  this  way. 

The  experiments  with  single  spelled  speech  letters  provide  two  clues  for 
letter  identification,  one  purely  on  a  time  basis,  the  other  by  means  of 
resonators. 

REFERENCES 

1. d’Albe, E. E. Fournier, “The Optophone: An Instrument for Reading by Ear,”
   Nature, Vol. 105 (1920), p. 295.

2. Beddoes, M. P., E. S. W. Belyea, and W. C. Gibson, “A Reading Machine for
   the Blind,” Nature, Vol. 190 (1961), p. 874.

3. Coffey, J. L., H. E. Hull, D. M. Metcalfe, and L. J. Mason, “The Further
   Development of Aural Reading Devices for the Blind,” in Battelle Research
   Report, June, 1961.

4. Fairbanks, G., and F. Kodman, “Word Intelligibility as a Function of Time
   Compression,” J. Acous. Soc. Amer., Vol. 29 (1957), pp. 636-641.

5. Freiberger, H., and E. F. Murphy, “Reading Machines for the Blind,” I.R.E.
   Professional Group on Human Factors in Electronics, March 1961, pp. 8-19.

6. Metfessel, M., and C. Lovell, “Spelled Speech as Output for an Automatic Reader,”
   in Research Report. Los Angeles: University of Southern California, 1961
   (Psychology Department).

7. Miller, G. A., and J. C. R. Licklider, “The Intelligibility of Interrupted Speech,”
   J. Acous. Soc. Amer., Vol. 22 (1950), pp. 167-173.




THE DEVELOPMENT AND EVALUATION OF
THE BATTELLE AURAL READING DEVICE

JOHN  L.  COFFEY 

Battelle  Memorial  Institute,  Columbus,  Ohio 


INTRODUCTION 

The  work  leading  to  the  development  and  evaluation  of  the  Battelle  aural 
reading  device  has  been  sponsored  primarily  by  the  Prosthetic  and  Sensory 
Aids  Service  of  the  Veterans  Administration.  The  work  was  started  in  April, 
1957,  and  has  continued  to  the  present  time.  During  the  calendar  years  of 
1960  and  1961,  additional  work  with  the  device  was  supported  by  a  grant 
from  the  National  Institute  of  Neurological  Diseases  and  Blindness,  National 
Institutes  of  Health. 

This  paper  contains  a  description  of  the  Battelle  aural  reading  device 
for  the  blind  and  a  description  of  an  accessory  feature  for  the  reading  device. 
In  addition,  an  account  is  given  of  training  experience  that  has  evolved  from 
use  of  the  device  by  blind  people.  The  purpose  of  this  paper  is  not  only  to 
describe  the  program  of  research  that  has  been  accomplished  up  to  this  time, 
but  also  to  take  a  realistic  look  at  the  capabilities  of  the  Battelle  device  and 
its future role as an aid in permitting at least a segment of the blind population
greater realization of their potential.

The  Optophone 

What  is  an  optophone?  The  optophone  (with  a  lower  case  “o”)  is  defined 
in  Webster’s  unabridged  dictionary,  second  edition,  as  “an  instrument  by 
which  light  energy  is  converted  into  sound  energy,  so  that  a  blind  person 
is  enabled  by  its  use  to  locate  and  estimate  varying  degrees  of  light  through 
the  ear  and  thus  even  to  read  printed  matter.”  The  Optophone  (with  a 
capital  “O”)  is  usually  used  to  refer  to  a  print  reading  machine  invented 
by  E.  E.  Fournier  d’Albe  in  1912  (2).  The  original  invention  by  Fournier 
d’Albe  was  the  Exploring  Optophone.  This  device  was,  in  effect,  a  guidance 
device  in  that  it  was  used  to  assist  the  blind  person  in  getting  around  by 
detecting  various  concentrations  of  light  energy.  The  first  print  reading 
Optophone  was  demonstrated  in  England  in  1913. 


Since  1913  several  significant  improvements  in  the  Optophone  have 
brought  it  to  its  present  stage  of  development.  At  this  time  it  is  found  in 
only  limited  numbers.  The  only  Optophone  known  by  this  author  to  still  be 
in  actual  operation  is  the  one  used  by  Miss  Mary  Jameson  in  Anerley,  a 
suburb  of  London,  England.  She  was  one  of  the  original  students  used  in 
the  assessment  of  its  potential  value  as  a  reading  device  for  the  blind. 

Before briefly reviewing the history of the development of the Optophone,
it would be well to examine the concept involved in its invention.
The Optophone is a representative of a class of reading machines which
Cooper (2) has labelled direct translating, nonintegrating devices. His classification
was made on the basis of the operation of the machine in converting
the input into the output. The direct translating, nonintegrating classification
groups reading machines, and it expresses the principle upon which the
Optophone was built. Direct translating and nonintegrating means essentially
that when printed material is sampled by the reading device the output of
the device is directly and immediately determined by the printed material as
a function of both the continually changing contour of the print and the
point in time at which it is sampled. The direct translating part of this classification
means that the relative location of the contour of the print is indicated
to the blind person in the output. In general, devices built up to this
time have used relative vertical location of the print contour; however,
vertical sampling is not the only conceivable use of this feature. The nonintegrating
part of the classification, of course, refers to the time sampling
procedure of the device. Devices that are characteristic of this classification
have no integrating-over-time feature. That is, there is an instant-to-instant
correspondence in time between the input and the output of the device. The
configuration of printed contours representing the input at a point in time
determines the output for that time sampling. As the input changes from
instant to instant, as it would in the reading task, the output changes in the
same temporal manner. In general, direct translating, nonintegrating devices
have examined vertical slices of letters from left to right in temporal
sequence. Because of this procedure they have been designated as letter
reading machines.
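
The direct translating, nonintegrating principle can be illustrated with a toy scan. The sketch below is an assumption-laden illustration, not a model of any particular machine: the letter is a hypothetical four-row bitmap, and the output for each vertical slice depends only on that slice, with no memory of earlier slices.

```python
# Hypothetical bitmap of a capital L; "X" is black, "." is white, row 0 is the top.
LETTER_L = [
    "X..",
    "X..",
    "X..",
    "XXX",
]


def scan_columns(bitmap):
    """Return, for each vertical slice from left to right, the black row indices.

    Direct translating: the output reports where the contour lies vertically.
    Nonintegrating: each slice is reported on its own, instant by instant.
    """
    height = len(bitmap)
    width = len(bitmap[0])
    return [
        [row for row in range(height) if bitmap[row][col] == "X"]
        for col in range(width)
    ]


for slice_index, rows in enumerate(scan_columns(LETTER_L)):
    print(slice_index, rows)
```

The burden of assembling the successive slices into a letter falls entirely on the listener, which is why such machines are simple to build but hard to learn.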

The  general  impression  exists  that  machines  built  according  to  the 
direct  translating,  nonintegrating  principle  represent  from  an  engineering 
standpoint  about  the  simplest  possible  functional  reading  devices  for  the 
blind.  It  is  also  generally  agreed  that  the  learning  task  involved  in  any 
practical  use  of  the  output  is  extremely  difficult. 

Cooper extends his classification of reading machines to include direct
translating, integrating devices, and recognition devices as opposed to direct
translating devices. Although these additional classifications are not of
particular interest in this paper, it is possible to make a fairly general statement
about them. If the direct translating, nonintegrating classification is
considered as a starting point, any movement away from this starting point
to other classifications generally represents an increase in engineering complexity
of the resulting devices and a decrease in the difficulty of the learning
task required for use of the output by the blind person.

Returning  briefly  to  the  development  of  the  Optophone,  it  is  now  easier 
to  understand  its  operation.  In  the  original  Optophone  the  input  was,  of 
course,  printed  material.  The  output  selected  was  audible  or  aural,  and  the 
sampling  method  used  was  to  examine  vertical  slices  of  the  letter  from  left 
to  right.  The  original  Optophone  was  a  ‘white  reading’  device;  in  1920  it  was 
improved and made a ‘black reading’ instrument. A description of the operation
of the black reading Optophone has been offered by Freiberger and
Murphy: “This device (the white reading Optophone) was improved, and
a  1920  patent  showed  a  device  for  reading  black  letters  by  illuminating  a 
vertical  section  of  the  letter  area  with  light  pulsed  at  five  frequencies  by 
holes  in  five  annular  zones  of  a  rotating  disc.  Unbalance  in  an  electrical 
bridge  having  selenium  cells  in  two  of  its  arms,  one  receiving  part  of  the 
pulsed  outgoing  light,  the  other  the  reflected  light  modulated  by  the  print  on 
the  paper,  was  used  to  feed  to  the  earpiece  an  audio  signal  wherein  a  low 
tone corresponded to black at the bottom of the area scanned, with progressively
higher tones for black in the upper parts of the area. A mechanical
mounting moved the optical system smoothly along the line of type but
allowed the user to adjust the horizontal scanning rate, to slow the scanning
momentarily, or to retrace it if necessary” (3).

In  the  same  publication  Freiberger  and  Murphy  pointed  out  later 
modifications  to  the  black  reading  Optophone.  They  stated  that  after  World 
War  II  a  few  of  the  machines  were  modified  for  six  channels,  compared  to 
the  original  five,  and  newer  photocells  were  used  in  place  of  the  early 
selenium  units. 

Some  Other  Direct  Translating,  Nonintegrating  Devices 

Subsequent to the invention of the Optophone other direct translating, nonintegrating
devices have appeared. Most notable of these were the Visagraph
invented  by  Robert  E.  Naumberg  and  described  in  1928,  and  the  Radio 
Corporation  of  America  (RCA)  Type  A-2  machine  developed  by  RCA 
Laboratories  in  1946. 


The  Visagraph  is  a  direct  translating,  nonintegrating  device  employing 
a  tactile  output.  Essentially,  this  machine  makes  an  enlarged  and  raised 
replica  of  the  printed  material  by  embossing  aluminum  foil.  The  machine 
was  improved  in  1947  and  made  a  “Faximile”  Visagraph  under  sponsorship 
of  the  Committee  on  Sensory  Devices  (3). 

The  RCA  Type  A-2  reader  represented  another  fairly  extensive  attempt 
to  build  a  direct  translating,  nonintegrating  reading  device  for  the  blind. 
This  device,  like  the  optophone,  employed  an  audible  output.  The  details 
of  its  operation  were  somewhat  different  from  the  Optophone  in  that  a 
vertical  sweeping  spot  of  light  and  a  variable  frequency  oscillator  were  used. 
The  vertical  sample  of  the  letter  was  swept  from  bottom  to  top  by  the  light 
spot,  and  as  the  light  encountered  contours  of  letters  the  variable  frequency 
oscillator  was  turned  on  by  a  photosensitive  element.  The  frequency  of  the 
output  was  determined  by  the  position  of  the  sweeping  light  spot  when  it 
encountered  the  letter  contour.  The  repetition  rate  of  the  vertical  sweeping 
spot  of  light  defined  the  sweep  frequency  of  the  reader. 

An  evaluation  of  the  RCA  Type  A-2  reader  with  blind  subjects  was 
conducted.  The  results  of  this  evaluation  have  been  stated  by  Freiberger 
and  Murphy:  “The  mixed  conclusion  of  this  study,  neither  abandoning 
direct-translation  devices  nor  suggesting  immediately  contracting  for  a 
supply  for  all  blind  persons,  indicated  further  development  work  was 
needed” (3).

Other  attempts  to  build  devices  similar  in  operational  principle  to  those 
discussed  above  have  been  made  with  varying  degrees  of  success.  These 
devices  represent  an  interesting  array  of  efforts  to  bring  the  printed  word 
directly  to  the  blind  without  special  intervening  preparation.  It  is  not  within 
the  scope  of  this  paper  to  discuss  all  of  these  attempts,  but  a  limited  look 
at historical events has been useful for providing the proper setting for discussing
the work represented by the development and evaluation of the
Battelle  device. 

THE  BATTELLE  AURAL  READING  DEVICE 

The  research  devoted  to  the  development  and  evaluation  of  the  Battelle 
aural  reading  device  has  been  directed  toward  three  major  areas.  The  first 
area has been reading device development, the second has been development
of training procedures for evaluating the device, and the third area
has  been  the  development  of  mechanical  tracking  accessories.  These  three 
areas  represent  the  general  framework  of  the  total  research  program. 


The  Reader 

The  Battelle  reader  is  a  direct  translating,  nonintegrating  device  employing 
a vertical sampling of a printed letter from its left side to its right. The output
of the device is audible or aural; it is an optophone type of output in that
it  employs  discrete  tones  of  different  frequencies  to  indicate  the  relative 
position  of  print  contours  being  sampled  by  the  reading  probe.  The  details 
of  operation  of  the  device  are,  however,  somewhat  different  from  the 
original Optophone as modified by Barr, Stroud, and Fournier d’Albe.
These  differences  will  become  apparent  in  the  discussion  of  the  development 
of  the  device.  A  Battelle  reader  is  shown  in  Figure  1. 

The  Battelle  reader  is  made  up  of  two  primary  units.  The  first  unit  is 
the  reading  probe  which  consists  of  rollers  for  moving  the  probe  and  a 
housing containing lamps, lens, a linear array of photocells, and a mechanism
for adjusting the probe to different sizes of print. The second unit contains
the electronics of the device from which the headphones extend.
The headphones transmit the audible signals to the user’s ears. The electronics
include controls, oscillators, switching circuits, a mixer, and an
amplifier. 

Input  and  Output.  The  input  to  the  Battelle  device  is  printed  material. 
The  reading  probe,  which  may  be  hand-held  or  inserted  into  a  mechanical 
tracker,  is  moved  along  a  printed  line  from  left  to  right.  As  the  probe  is 
moved  on  its  rollers  over  a  letter  the  letter  is  illuminated  by  two  small  lamps 
mounted  on  either  side  of  a  lens.  The  image  of  the  letter  is  focused  and 
transmitted  by  the  lens  up  into  the  cylinder  of  the  probe.  A  narrow  vertical 
section  of  the  letter  falls  on  a  linear  array  of  nine  photocells.  The  focus  of 
the  letter  image  is  so  adjusted  that  for  the  letter  with  the  greatest  range, 
lower case j, the focused image never exceeds the length of the nine photocells.
In other words, the bottom of the j is projected on one of the end
photocells and the dot of the j is focused on the photocell at the opposite
end of the array. Of course, the rest of the j falls on the intervening photocells.
As the probe is moved from the left side of any letter to the right the
pattern  of  contours  falling  on  the  photocell  array  changes  in  the  same 
temporal  manner. 

Figure 1  The Prototype Reader

The electronic components of the device include nine
transistorized oscillators. Each photocell in the array controls an oscillator
by means of a transistorized switch. The nine oscillators are tuned from
440 to 2795 cps with an equal logarithmic separation between adjacent frequencies.
When one-third or more of a photocell in the array is covered by
black, the oscillator controlled by that photocell is turned on. With less than
one-third black coverage or all white coverage the oscillator is not turned on;
it is turned off if it is already functioning. The photocells in the array and
their corresponding oscillators are so arranged that when the low part of a
letter is sampled (such as descenders in p, y, g, etc.) the lower tones are
heard by the user. If the part of the letter being sampled is relatively higher,
relatively higher tones are heard. The highest tone is heard at the top of
ascenders such as d, f, etc.
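
The tone assignment and switching rule described above can be sketched as follows. The frequency limits, the number of channels, and the one-third threshold come from the text; the function names, the list representation of photocell coverage (index 0 taken as the bottom photocell), and the rounding of frequencies for display are our assumptions.

```python
def tone_frequencies(low_hz=440.0, high_hz=2795.0, channels=9):
    """Frequencies with equal logarithmic separation between neighbours."""
    step = (high_hz / low_hz) ** (1.0 / (channels - 1))
    return [low_hz * step ** k for k in range(channels)]


def active_tones(black_fraction, threshold=1.0 / 3.0):
    """Tones sounding for one vertical slice of print.

    `black_fraction[k]` is the share of photocell k covered by black,
    with k = 0 assumed to be the bottom of the array (lowest tone).
    An oscillator is on when coverage is one-third or more.
    """
    freqs = tone_frequencies(channels=len(black_fraction))
    return [round(f) for f, b in zip(freqs, black_fraction) if b >= threshold]


# A slice in which only the second photocell from the bottom is half covered:
print(active_tones([0.2, 0.5, 0, 0, 0, 0, 0, 0, 0]))
```

Sweeping such slices from left to right across a letter gives the time-varying chord that the user learns to recognize.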

By  moving  the  probe  from  the  left  side  of  a  letter  to  the  right  the 
pattern  of  contours  of  the  letter  on  the  photocells  changes,  and  the  user 
hears a tone pattern that is characteristic of the letter. By the same procedure,
if the reading probe is moved from left to right over an entire word,
the user hears a series of tone patterns characteristic of that word. By
learning tone patterns for letters or a series of tone patterns for words the
user  is  able  to  read  printed  matter.  The  most  effective  method  of  instruction 
found  up  to  this  time  has  been  a  combination  of  these  procedures. 

Model A Readers. The Battelle device has gone through three modifications
of enough importance to warrant the specification of four models. The
entire developmental program of the reader is an interesting account;
however, it is not possible to include the entire history here. It would be
appropriate to point out the changes that led from one model to another.
The  first  five  readers  built  have  been  designated  as  Model  A  readers.  These 
readers  had  three  controls  for  use  by  the  blind  person.  One  control  was  an 
on/off  control,  the  second  controlled  the  volume  of  the  output,  and  the  third 
was  for  controlling  a  rheostat  that  adjusted  the  intensity  level  of  the  lamps 
in the probe. The Model A readers also had an array of 11 photocells that
were  assembled  from  individual  photocells  commercially  available.  The 
range  of  frequencies  in  the  output  was  from  400  to  4000  cps  with  equal  log 
separation. 

Model  B  Readers.  Experience  with  the  Model  A  readers  indicated  the 
necessity  for  some  changes  and  the  desirability  of  others.  Some  of  the 
changes were: (1) an improved cable between the probe and the electronic
components,  (2)  an  improved  filtering  of  the  ac  power  supply,  (3)  the 
use  of  matched  photocells  and  encapsulating  the  entire  array,  (4)  detailed 
modifications  to  the  audio  oscillators  and  amplifier,  (5)  improved  rollers  on 
the  reading  end  of  the  probe,  (6)  combining  the  on/off  and  volume  controls, 
(7)  a  slight  redesigning  of  the  probe,  and  (8)  a  more  efficiently  arranged 
packaging  of  the  device.  Three  readers  incorporating  these  improvements 
were  constructed.  These  readers  have  been  designated  as  Model  B  readers. 
The  reader  shown  in  Figure  1  is  a  Model  B  reader. 

Model  C  Reader.  One  Model  C  reader  was  constructed  for  two  purposes. 
The  first  purpose  was  to  include  additional  improvements  that  were  felt  to 
be advantageous from experience with the Model A and Model B readers.
The second purpose was to build a reader with some flexibility features for
experimental  purposes. 

The major functional improvements that were incorporated into the
Model C reader were in the reading probe. The reading probe was completely
redesigned. The shape of the reading probe head was changed to a
rectangular configuration to assist the user in manual tracking. This shape
also simplified the mechanical tracking mount necessary for the probe. A
new photocell array was also designed for simpler control requirements
which in turn required a change in the internal configuration of the probe.
An improved method of size-of-type adjustment was also designed and included.
Additional lamps were added in the probe head and their location
was somewhat altered.

The  work  involving  the  redesign  of  the  photocell  array  represented  an 
important  improvement.  The  cooperation  of  the  Clairex  Corporation  in  New 
York  City  was  very  important  in  this  effort.  Prior  to  June  1959,  the  usual 
method  of  obtaining  an  array  was  to  match  and  then  assemble  individual 
photocells.  Prior  to  the  construction  of  the  Model  C  reader,  the  Clairex 
Corporation  developed  a  process  by  which  photoconductive  material  could 
be  deposited  on  ceramic  substrates  by  evaporation  techniques.  A  few  of  the 
older  type  of  arrays  were  constructed  by  this  procedure  and  proved  to  be 
superior  to  the  arrays  in  which  individual  photocells  had  been  matched  and 
then  assembled. 

The  evaporation  (or  vacuum  deposition)  technique  forms  the  entire 
array of cadmium selenide on a preformed ceramic substrate. The superiority
of this method of construction for the reading machine is that the technique
results in an entire array that is essentially one photocell divided
into 11 parts. The individual cells are formed by machining away small
amounts  of  the  photoconductive  material  to  form  separations  between  the 
individual  photocells.  Therefore,  all  photocells  in  the  array  have  about  the 
same  original  characteristics,  which  decreases  to  some  extent  the  major 
problem  of  differential  drifting  of  these  characteristics  with  use  and  age. 
The  differential  drifting  of  photocells  is  a  maintenance  problem  in  that 
optimum  performance  of  the  reader  occurs  only  when  all  photocells  are 
equally  sensitive.  The  evaporation  technique,  although  not  eliminating  the 
differential sensitivity problem, was an improvement over the original procedure
of selecting and assembling individual photocells.

In the Model C reader the evaporation technique and an array of different
design were the major advances. By fabricating an array to include
a standard photocell, the manual control of the rheostat which regulated the
intensity of the lamps in the reading probe was eliminated. In the earlier
readers the light intensity was adjusted manually to hold the photocells at
the resistance obtained when they were “looking at” white or the “no-signal”
condition. In the Model C reader the manual light adjustment was replaced
by an automatic control. The automatic control was obtained by providing
the standard photocell in the array. The illumination intensity is automatically
adjusted to keep the resistance of the standard photocell constant.
The standard photocell is placed so that it never samples printed material
when the probe is in proper tracking position. The elimination of the manual
light control removed a large source of inconvenience for the blind user.
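
The automatic light control described above is a feedback loop, and its behavior can be sketched numerically. Everything quantitative here is a hypothetical stand-in: the source does not give the circuit, so the inverse relation between resistance and illumination, the constants, and the simple proportional correction are assumptions made only to show the principle of holding the standard photocell's resistance constant.

```python
def regulate_lamp(target_ohms, lamp_efficiency, steps=200, gain=0.02,
                  cell_constant=1000.0, intensity=5.0):
    """Adjust lamp drive until the standard photocell reaches `target_ohms`.

    Assumed model: resistance = cell_constant / (lamp_efficiency * intensity),
    i.e. the photoconductor's resistance falls as illumination rises.
    `lamp_efficiency` below 1.0 stands for an aged, dimmer lamp.
    """
    for _ in range(steps):
        resistance = cell_constant / (lamp_efficiency * intensity)
        # Too much resistance means too little light, so raise the drive.
        intensity += gain * (resistance - target_ohms)
    return intensity, cell_constant / (lamp_efficiency * intensity)


new_drive, new_res = regulate_lamp(100.0, lamp_efficiency=1.0)
aged_drive, aged_res = regulate_lamp(100.0, lamp_efficiency=0.8)
print(new_drive, new_res)
print(aged_drive, aged_res)
```

In this toy model the aged lamp ends up driven harder while the photocell resistance settles at the same value, which is the effect the standard photocell was intended to achieve without user intervention.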

The  addition  of  extra  lamps  in  the  reading  probe  was  another  attempt 
to decrease the sensitivity problem of the differential drifting of the photocells.
The adjustment of the sensitivity of the photocells, once properly set,
would  also  need  changing  as  the  lamps  aged  and  the  resulting  light  pattern 
on  the  printed  page  changed.  By  using  improved  lamps,  and  more  of  them, 
an  attempt  was  made  to  keep  the  light  pattern  falling  on  the  printed  page 
constant. 

The  changes  included  in  the  Model  C  reader  designed  for  experimental 
flexibility  were  variation  in  tuning  of  each  oscillator  and  individual  volume 
controls for each of the 11 sound channels. Also included as improvements,
and not necessarily experimental in nature, were improved switching circuitry
and an improved amplifier.

Each oscillator in the Model C reader is tunable over a frequency
range of plus or minus 1 1/2 percent. This feature allowed for the study of
some variation in the code output. The individual volume controls allowed
for adjustment for subjective equality of loudness of the individual tones
for individual users. This feature also allowed for the study of cuing effects
by making certain critical tones louder than the rest of the auditory display.

Work  with  previous  models  and  the  Model  C  reader  resulted  in  some 
slight changes in the auditory display. The original display consisted of 11
tones,  equally  spaced  logarithmically,  in  the  range  of  400  to  4000  cps.  The 
first  and  eleventh  photocells  were  used  primarily  as  tracking  aids.  When  a 
user  began  tracking  the  line  of  print  too  high  he  would  hear  the  eleventh  or 
top  tone  which  indicated  that  he  was  beginning  to  track  into  the  line  above 
the  desired  one.  If  he  heard  the  first  or  lowest  tone  this  indicated  he  was 
tracking  the  line  of  print  too  low  and  was  beginning  to  pick  up  the  top  of  the 
line  below  the  desired  one.  The  addition  of  mechanical  tracking  eliminated, 
to  a  great  extent,  the  need  for  these  two  channels. 
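
The spacing of the original display follows directly from the figures given above. The sketch below is only an illustration: the function name is hypothetical, and it assumes the two end tones sit exactly at 400 and 4000 cps with a constant ratio between neighbors.

```python
def log_spaced_tones(low_hz, high_hz, count):
    """Return `count` frequencies with a constant ratio between neighbors."""
    ratio = (high_hz / low_hz) ** (1.0 / (count - 1))
    return [low_hz * ratio ** i for i in range(count)]


# Original display: 11 tones from 400 to 4000 cps.
tones = log_spaced_tones(400.0, 4000.0, 11)
for channel, freq in enumerate(tones, start=1):
    print(f"channel {channel:2d}: {freq:7.1f} cps")
```

Under these assumptions each tone is about 1.26 times the frequency of the one below it, since ten equal steps must span one decade.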


An  improved  standard  code  which  combined  equal  log  separation  of 
the  tones  and  musical  intervals  was  tried  and  found  to  be  at  least  as 
effective  as  the  original  standard  code  and  somewhat  more  pleasing  to  the 
user.  The  advantages  of  the  improved  standard  are:  (1)  it  retains  equal 
log  separation,  (2)  the  highest  tone  is  lower,  making  it  more  acceptable 
subjectively,  (3)  the  aesthetic  effect  of  musical  intervals  is  pleasing,  and 
(4)  the  familiarity  of  the  musical  interval  is  an  aid  in  operating  the  size-of- 
type  adjustment. 

The  improved  code  resulting  from  this  work  has  9  channels  containing 
equal  log  separation  and  a  frequency  range  of  440  to  2795  cps. 
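
The musical character of the improved code can be checked with a little arithmetic. The text does not name the interval, so the sketch below simply assumes that the 9 tones are uniformly spaced on a log scale between 440 and 2795 cps and reports the step size in equal-tempered semitones; the function name is hypothetical.

```python
import math


def channel_ratio(low_hz, high_hz, channels):
    """Ratio between adjacent tones when the spacing is uniform on a log scale."""
    return (high_hz / low_hz) ** (1.0 / (channels - 1))


ratio = channel_ratio(440.0, 2795.0, 9)
semitones = 12.0 * math.log2(ratio)  # one step, in equal-tempered semitones
print(f"adjacent-tone ratio: {ratio:.4f}")
print(f"step size: {semitones:.2f} semitones")
```

On these assumptions each step comes out very close to four semitones, that is, a major third above the A at 440 cps, which would be consistent with the statement that the code combines equal log separation with musical intervals.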

Concurrent  with  the  development  of  the  Model  C  reader  a  detailed 
study  to  determine  the  optimum  code  for  a  direct  translating,  nonintegrat¬ 
ing  device  was  conducted.  This  study  was  too  detailed  to  be  considered 
here;  however,  the  results  indicated  that  of  all  the  codes  screened  as  being 
promising  for  study  the  improved  standard  already  described  above  was 
superior.  This  work  was  conducted  to  be  as  meaningful  as  possible  for 
comparison  with  previous  research  in  this  area,  particularly  that  done  at 
Haskins Laboratories during 1944 to 1947 and sponsored by the Committee
on Sensory Devices (1). In general, the Battelle study confirmed
the  results  of  the  code  work  at  Haskins  Laboratories. 

Model  D  Readers.  The  Model  D  readers,  although  not  constructed  at  the 
time  of  this  writing,*  represent  the  best  features  of  all  models  so  far  con¬ 
structed  from  both  the  producer’s  and  the  user’s  point  of  view.  These 
models  will  not  include  the  experimental  features  incorporated  in  the  Model 
C  reader,  but  will  include  a  greatly  improved  size-of-type  adjustment.  The 
major  consideration  in  the  construction  of  these  readers  has  been  to  com¬ 
bine  features  most  desirable  for  the  user  with  those  principles  most  likely 
to  result  in  a  useful,  practical,  and  reliable  device.  It  is  anticipated  that 
ten  units  of  the  Model  D  reader  will  be  constructed  by  the  end  of  June  1962. 

The  Model  D  reader  will  have  an  output  of  9  tones  in  the  range  of  440 
to  2795  cps  with  equal  log  separation.  The  probe  will  have  two  improved 
lamps  rather  than  four  as  was  the  case  with  the  Model  C  reader. 

Mechanical  Tracking 

From the beginning of the research program it was realized that some
device for mechanically moving the reading probe along the line of print
would be of considerable assistance to the blind user of the reader. It
was also realized, however, that the more accessories required for the
reader, the less its flexibility for use and the greater its cost to the user.

* Early in 1962.

Early  in  the  research  program  it  was  decided  to  attempt  to  teach  the 
use  of  the  device  with  the  user  tracking  the  probe  along  the  printed  line 
manually.  It  was  hoped  that  the  task  of  tracking  in  a  straight  line  would 
be  a  motor  skill  learned  rather  rapidly  by  the  user.  In  addition,  the  two 
rollers  mounted  on  the  reading  probe  were  quite  long,  and  tended  to  re¬ 
sist  any  other  than  straight-line  movement  once  the  probe  was  properly 
started  on  a  line  of  print. 

Experience  with  the  readers  indicated  that  the  manual  tracking  task 
was  formidable  for  the  blind  user.  From  observation  of  the  users  it  was 
possible  to  infer  two  major  problems  and  to  hypothesize  a  third.  The  first 
problem  was  that  of  tracking  a  straight  line  within  the  limited  tolerances 
imposed  by  the  size  of  a  printed  line  of  type.  Although  users  could  read 
the  code  even  if  it  was  shifted  upward  or  downward  by  two  or  three  tones, 
it  was  difficult  for  them  to  maintain  even  those  tracking  limits.  The  original 
idea  of  a  linear  array  was  aimed  primarily  at  compensating  for  this  type 
of  error.  The  second  problem  was  the  difficulty  in  keeping  the  reading 
probe  at  a  90-degree  angle  to  the  printed  line.  Any  appreciable  rotation 
of the reading probe distorts the output seriously. The hypothesized problem
was that the reading task presented to the blind users was simply too
much  for  them  to  learn  all  at  once.  In  the  reading  process  there  was  the 
tracking  task  to  learn,  the  probe-line  angle  task  to  learn,  and  a  difficult 
code  to  learn.  Although  students  were  able  to  master  all  of  these  tasks 
at  the  same  time,  with  the  result  of  very  slow  reading  rates,  it  was  be¬ 
lieved  that  much  of  the  difficulty  could  be  removed  with  an  adequate 
mechanical  tracking  device,  even  though  such  a  device  would  limit  some¬ 
what  the  flexibility  of  the  machine.  It  was  believed  that  the  performance 
of  even  the  best  students  could  be  facilitated  considerably  with  mechanical 
tracking. 
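
One way to picture why a vertical shift of two or three tones left the code readable is sketched below. The sketch assumes the original 11-tone display with equal log spacing and a made-up ink pattern; the function names and the shift convention are hypothetical. With equal log spacing, moving the pattern up or down the array multiplies every sounded frequency by the same factor, so the intervals between the tones are unchanged. Whether this is what users actually relied on is an interpretation, not something the report states.

```python
def tone_frequencies(low_hz=400.0, high_hz=4000.0, count=11):
    """Equal-log-spaced tones, one per photocell (lowest cell first)."""
    ratio = (high_hz / low_hz) ** (1.0 / (count - 1))
    return [low_hz * ratio ** i for i in range(count)]


def sounded_tones(inked_cells, shift=0, count=11):
    """Frequencies heard for one vertical slice of a letter.

    inked_cells: 0-based photocell indices that see ink when tracking is exact.
    shift: tracking error in whole channels (positive moves the pattern
           toward the higher tones; a convention chosen for this sketch).
    Cells pushed off either end of the array are simply lost.
    """
    freqs = tone_frequencies(count=count)
    shifted = [c + shift for c in inked_cells if 0 <= c + shift < count]
    return [freqs[c] for c in shifted]


def intervals(freqs):
    """Frequency ratios between successive sounded tones."""
    return [round(b / a, 6) for a, b in zip(freqs, freqs[1:])]


slice_of_letter = [3, 5, 6]  # made-up pattern for illustration
exact = sounded_tones(slice_of_letter)
drifted = sounded_tones(slice_of_letter, shift=2)
print("intervals, exact tracking:  ", intervals(exact))
print("intervals, two tones too high:", intervals(drifted))
```

The sketch also shows the limit of this tolerance: once the shift pushes part of the pattern off the end of the array, those tones are lost altogether.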

Early  attempts  to  develop  a  mechanical  tracker  resulted  in  devices 
which  lacked  the  necessary  precision;  later  attempts  resulted  in  a  simple 
device  that  permitted  the  testing  of  the  mechanical  tracking  principle  on 
a  fairly  extensive  basis.  Earlier  and  more  limited  experiments  had  indi¬ 
cated  that  it  would  be  a  worthwhile  accessory  for  increasing  reading  speeds. 

The  simple  mechanical  tracking  devices  consisted  of  a  board  for  holding 
the  printed  material  to  be  read  and  a  small  carriage  for  holding  the  reading 
probe.  The  carriage  was  mounted  on  a  metal  crossbar  which  could  be 
moved from the top of the board to the bottom. The carriage could be
moved from left to right on the metal crossbar. This arrangement allowed
free  motion  of  the  reading  probe  in  a  plane;  therefore,  a  printed  line  could 
be  read  from  left  to  right  with  accuracy  well  within  the  limits  necessary 
for  reading.  The  carriage  of  the  tracker  also  held  the  probe  at  the  proper 
probe-line  angle.  The  carriage  was  moved  manually  along  the  metal 
crossbar. 

Two of the simple tracking devices were improved by adding a spring-powered,
hydraulically damped system that drove the carriage along the crossbar
at a speed selected by the user. One of the more complex tracking
devices  is  shown  in  Figure  2. 


Figure  2  Complex  Mechanical  Tracking  Device 


The  major  advantage  of  the  complex  tracker  was  that  it  provided  more 
uniform  motion  for  the  probe.  The  travel  of  the  probe  could  be  slowed 
down manually, or a part of the line could be retraced by the user. On
the whole, however, the more complex tracker served to pace the reader, and
higher reading rates were generally demonstrated with its use.

The  mechanical  tracking  devices,  although  only  experimental  models, 
contributed greatly to the attempt to decrease the difficulty of the reading
task and to increase reading speeds. The incorporation of mechanical
tracking  was  one  of  the  two  large  forward  steps,  exclusive  of  reading  ma¬ 
chine  development,  taken  in  the  research  program.  The  other  significant 
step  involved  training  procedures. 

Training Procedures

Early  in  the  research  program  the  first  of  a  series  of  training  programs 
was  initiated.  The  history  of  development  of  training  procedures  for  the 
Battelle  device  is  quite  long  and  includes  many  changes;  only  a  summary 
can  be  given  here. 

The  majority  of  organized  training  activity  has  been  conducted  at 
the  Ohio  State  School  for  the  Blind  in  Columbus.  Other  limited  training 
has  been  given  at  the  homes  of  several  blind  persons. 

The  first  three  years  of  training  emphasized  the  letter  as  the  unit  of 
instruction.  During  that  time  primary  emphasis  was  put  upon  learning 
the  auditory  coded  representations  of  the  lower-case  alphabet,  the  upper¬ 
case  alphabet,  and  the  nine  numerals.  Students  were  trained  to  certain 
criterion  scores  on  alphabet  tests  and  word  tests  made  up  of  letters  they 
had  already  learned.  Once  they  had  passed  the  criterion  tests  they  were 
introduced  to  the  more  general  reading  task.  The  criterion  tests  were  never 
100  percent  correct  identifications,  because  certain  ambiguities  exist  among 
some  letters  when  heard  in  the  reading  machine  code.  In  the  lower-case 
alphabet  such  letters  as  m,  w,  v,  g,  y,  d,  and  j  are  fairly  easy  for  students 
to  learn.  Letters  such  as  h,  z,  i,  c,  and  e  are  extremely  difficult  because  of 
their  similarity  to  other  letters  when  represented  by  the  aural  code.  For 
example,  h  is  commonly  confused  with  b  or  k,  z  is  confused  with  x  or  s,  i 
with t or l, c with e or o, and e with s or c. In many cases it is almost
impossible to make absolute identifications of some of these letters, and only
in  the  context  of  the  word  or  sentence  can  they  be  identified  positively. 

The  philosophy  of  the  letter  training  procedures  was  that  if  students 
could  learn  to  identify  the  majority  of  printed  English  symbols  absolutely 
they  would  then  be  able  to  identify  any  word.  Those  symbols  they  could 
not  learn  absolutely  could  be  determined  by  contextual  cues.  It  is  possible 
that the slower reading speeds attained with the letter method were due to
the  training  procedures  necessary  to  attain  a  high-level  accuracy  of  identifi¬ 
cation  of  the  letters  and  numerals.  The  students,  from  the  beginning,  were 
given  training  emphasizing  letters;  they  were  also  reinforced  for  their 
ability  to  tell  the  differences  among  letters.  In  other  words,  the  majority 
of the training procedures emphasized letters; therefore, the students’ emphasis in
reading was also on letters or on identifying all letters in a word. It may
be  that  the  training  procedures  reinforced  behavior  that  was  not  particularly 
desired  in  the  reading  situation. 

In  general,  these  procedures  produced  average  reading  rates  of  about 
3  to  4  words  per  minute  after  65  to  70  hours  of  training.  The  reading  rates 
were measured on third- or fourth-grade level basic readers. With an
additional 100 hours of training some of the better students were reading at
rates  of  from  6  to  10  words  per  minute.  The  level  of  the  test  material  was 
also  somewhat  higher.  All  of  these  results  were  based  upon  manual  track¬ 
ing  by  the  subjects.  Undoubtedly  mechanical  tracking  would  have  increased 
these  rates  considerably. 

During  the  fourth  year  of  training  an  experimental  comparison  was 
made  between  letters  as  the  unit  of  instruction  and  words  as  the  unit  of 
instruction. Also compared were massed training with spaced
training, and the performance of children in the 9- to 11-year-old bracket
with the performance of older students. The effects of mechanical tracking
were  also  studied. 

When  mechanical  tracking  is  added  to  conditions  of  letter  training 
similar  to  those  used  in  the  first  three  years  of  instruction  average  perform¬ 
ance  is  increased  by  about  a  factor  of  two.  Word  training,  combined  with 
mechanical  tracking,  increases  average  performance  by  about  a  factor  of 
four  over  that  found  in  training  similar  to  that  used  in  the  first  three  years 
of  evaluation.  Another  general  statement  that  can  be  made  on  the  basis  of 
the  fourth  year  evaluation  program  is  that  children  of  the  9-  to  11-year  old 
age  group  do  not  do  as  well  as  older  students  in  learning  use  of  the  device 
under  any  of  the  experimental  conditions.  Although  the  children  per¬ 
formed  fairly  well  with  the  reader,  it  appears  that  an  age  of  9  to  11  years 
is  too  young  for  most  successful  work.  On  the  basis  of  the  fourth  year 
evaluation  program,  it  appears  that  students  should  be  at  least  13  years 
old before serious training with the reader is started. Finally, the choice
between massed and spaced training, as used in the evaluation program,
did not seem to have any appreciable effect on performance. For this
variable, three hours of training per week was compared with six hours
of training per week.
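
The factors quoted above can be turned into rough figures. The sketch below rests on two assumptions that the text does not confirm: that "average performance" means reading rate in words per minute, and that the 3 to 4 words per minute reported for the first three years is the right baseline. The names are hypothetical and the output is an illustration, not a reported result.

```python
# Baseline from the first three years: letter training with manual tracking.
BASELINE_WPM = (3.0, 4.0)

# Improvement factors stated for the fourth-year evaluation.
FACTORS = {
    "letter training + mechanical tracking": 2.0,
    "word training + mechanical tracking": 4.0,
}


def projected_range(baseline, factor):
    """Scale a (low, high) reading-rate range by an improvement factor."""
    low, high = baseline
    return (low * factor, high * factor)


for condition, factor in FACTORS.items():
    low, high = projected_range(BASELINE_WPM, factor)
    print(f"{condition}: about {low:.0f} to {high:.0f} words per minute")
```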

The  general  results  of  the  fourth  year  of  evaluation  have  been  cited 
above.  The  entire  evaluation  program  was  an  ambitious  effort,  and  the 
results have been reported in detail elsewhere (1).

The  present  training  program  is  based  primarily  on  information  gained 
during the fourth year of evaluation. Of course, all previous experience
with the device was also of considerable assistance in the program’s
construction. The program emphasizes the word as the unit of instruction,
but  does  not  preclude  the  use  of  letter  training  where  it  is  considered  the 
best  method  of  making  a  point  or  facilitating  training.  The  program  is 
aimed  at  several  objectives.  It  is  first  concerned  with  teaching  how  the 
device  should  be  used.  Second,  it  is  concerned  with  building  a  useful 
vocabulary  of  words  in  the  reading  machine  code,  and  development  of  the 
ability  to  generalize  the  constructed  vocabulary  to  many  other  words 
not  specifically  taught.  Third,  the  program  is  aimed  at  teaching  many 
of  the  practical  complications  in  reading  such  as  italics,  underlining, 
various  punctuation  marks,  capital  letters,  various  type  faces,  numerals,  and 
so  forth.  Toward  the  end  of  the  program  the  subjects  are  given  some 
formal  letter  training  to  augment  the  letter  symbols  incidentally  learned  in 
order  that  they  will  be  able  to  determine  words  not  determinable  in  any 
other  manner. 

Throughout  this  training  program  emphasis  is  placed  on  tracking  words 
as  units  and  on  moving  as  rapidly  and  steadily  as  possible.  The  use  of 
context  in  the  reading  situation  with  a  minimum  of  retracking  is  encouraged. 
The  specific  goals  of  the  program  are  to  teach  practical  reading  procedures 
and  to  bring  average  reading  performance  to  a  level  of  from  15  to  20 
words  per  minute  on  approximately  eighth  grade  level  reading  material  in 
200  lessons  of  about  one  hour  each. 

Some  other  factors  should  be  mentioned  in  connection  with  the  ex¬ 
perience  of  the  Battelle  evaluations.  The  best  reading  rate  recorded  with 
the  reading  device  was  37  words  per  minute,  using  the  spring  powered, 
hydraulically  damped  tracking  device.  This  speed  is  almost  double  the 
speed  of  19  words  per  minute  which  the  same  user  was  able  to  achieve 
using  manual  tracking.  He  has  been  in  the  program  since  the  beginning 
of  the  evaluation  phases.  Although  his  training  was  concluded  at  the  end 
of  the  third  year  of  evaluation,  after  having  received  about  180  hours  of 
formal  training,  he  has  been  permitted  rather  free  access  to  a  reading 
device  for  his  own  use.  One  other  user  is  equally  skilled;  his  top  recorded 
speed  under  the  same  conditions  was  23  words  per  minute.  However,  the 
second  user  has  spent  far  less  of  his  own  time  with  the  reader  than  the 
first. 

Other  research  conducted  along  with  that  of  the  development  of  train¬ 
ing  procedures  has  been  directed  to  the  study  of  selection  techniques.  Work 
has  progressed  toward  developing  simple  tests  involving  the  reader  with  a 
certain level of predictive power in selecting students for training with
the device. These tests combined with subjective information that may be
used  to  assist  in  selecting  students  should  provide  a  guide  for  selecting  those 
with  the  best  chances  for  success  with  the  device. 

There  are  a  number  of  training  procedures  using  the  Battelle  device 
that  have  not  been  explored.  A  great  deal  of  judgment  has  had  to  be  em¬ 
ployed  in  selecting  procedures  because  of  the  length  of  training  required 
before  the  effects  of  the  manipulation  of  any  one  variable  could  be  meas¬ 
ured.  It  would  have  been  interesting  to  test  other  approaches,  but  the  time 
required  has  not  been  available.  The  objectives  of  the  present  training 
program  are  viewed  as  realistic,  but  do  not  represent  the  final  word  as 
to  what  is  best.  With  additional  experience  it  is  entirely  possible  that  more 
rapid  reading  rates  will  be  achieved. 

Future  Plans 

Plans  for  the  future  include  an  evaluation  of  the  Battelle  Model  D  readers 
and  an  evaluation  of  the  training  program  by  an  independent  organization. 
These  evaluation  programs  are  desired  by  both  Battelle  and  the  Veterans 
Administration  to  establish  data  that  are  independent  of  the  original  or¬ 
ganization  that  conducted  the  development  and  evaluation. 

The  combined  results  of  independent  evaluations  and  those  of  Battelle 
will  be  major  factors  in  any  decision  by  the  Veterans  Administration  re¬ 
garding  the  future  of  the  reader.  Battelle  is  confident  that  with  the  improved 
Model  D  readers,  and  with  close  adherence  to  the  completed  training 
program,  the  results  already  obtained  will  be  equaled  and,  hopefully, 
surpassed. 

THE  ROLE  OF  THE  BATTELLE  READER 

In  general,  direct  translating,  nonintegrating  devices,  such  as  the  Opto¬ 
phone,  the  RCA  Type  A-2  reader,  and  the  Battelle  aural  reading  device, 
have  caused  concern  among  those  working  with  reading  machines  for  the 
blind.  There  have  been  two  primary  reasons  for  this  concern.  The  first 
concerns  the  relatively  slow  reading  rates  demonstrated  for  these  devices 
as a class. It has been noted that these devices are relatively slow when
compared with the reading ability of sighted persons or with the reading
speeds of blind persons using braille or talking books. The point might
be  argued  that  for  printed  material  not  available  in  braille  or  talking  books, 
and  in  the  absence  of  a  sighted  assistant,  reading  speeds  with  the  devices 
could  be  considered  relatively  fast. 

The second concerns the difficulty of learning the output which, of
necessity, is code-like. This shortcoming, like that of relatively slow
reading rates, is a valid reason for concern, in that the output code of these
devices is difficult to learn.

The  output  of  direct  translating,  nonintegrating  devices  has  often  been 
compared  with  International  Morse  Code  as  representing  an  analogous 
situation in which slow reading or receiving rates and a difficult learning
task obtain. This is possibly true in a very general sense. It has been noted
(2)  that  theoretically  the  upper  limit  for  receiving  International  Morse 
should  be  around  60  to  80  words  per  minute,  and  this  calculation  is 
supported  by  maximum  records  achieved  for  reception  of  International 
Morse. 

It  may  well  be  that  the  theoretical  limit  of  optophone-like  devices  is  a 
goal  not  yet  attained  in  evaluations  of  them.  Two  reasons  for  this  might  be 
given.  First,  it  is  entirely  possible  that  the  best  output  for  a  direct  trans¬ 
lating,  nonintegrating  device  has  not  yet  been  discovered  (although  much 
work  has  been  directed  toward  this  problem).  Second,  the  most  efficient 
method  of  instructing  blind  persons  in  the  use  of  the  output  has  not  been 
developed.  Either  of  these  factors  or  a  combination  of  them,  if  improved 
upon,  could  lead  to  a  demonstration  of  average  reading  speeds  that  would 
approach  more  closely  what  might  be  considered  a  theoretical  upper  limit. 

What  is  the  role  of  the  Battelle  device?  The  Battelle  reader  must  be 
considered  in  two  contexts,  one  present  and  one  future.  At  present,  the 
blind  have  access  to  printed  material  through  braille,  talking  books,  sighted 
assistants,  or  a  reading  machine  of  the  direct  translating,  nonintegrating 
type.  The  future  will  offer  these  same  aids,  plus  more  sophisticated  reading 
machines.  The  Veterans  Administration,  for  example,  has  been  active  in 
the  attempt  to  develop  and  evaluate  a  number  of  different  devices  useful 
for  the  blind;  some  of  these  have  been  reading  machines.  Under  their 
sponsorship  a  research  program  to  develop  a  reading  machine  intermediate 
between  direct  translating  and  recognition  types  has  been  under  way  for 
some  time  at  the  Mauch  Laboratories  in  Dayton,  Ohio.  This  machine, 
when  perfected,  will  allow  a  blind  subject  to  read  with  little  or  no  training, 
as  fast  as  is  presently  possible  with  the  Battelle  reader,  and  at  least  twice 
as  fast  with  some  training. 

The  Battelle  device  has  been  developed  for  a  specific  purpose.  It  has 
been  designed  for  use  by  a  segment  of  the  blind  population  who  need 
limited  access  to  printed  material,  not  available  in  other  forms,  for  the 
performance  of  their  specific  task.  This  appropriate  segment  of  the  blind 
population is defined by those who demonstrate ability with the device
and who also have a real need for its use. The Battelle device is not
designed to compete with or replace braille, talking books, or a sighted
assistant. Nor is the Battelle reader designed to permit blind people to read
fiction  for  pleasure. 

At  present  the  Battelle  reader  can  be  useful  for  blind  people  who  need 
limited  access  to  printed  material  not  otherwise  available,  or  access  to 
printed  material  which  in  the  best  interest  of  the  individual  should  be 
read  privately.  The  Battelle  reader  also  serves  a  personal  function  not  ob¬ 
tainable  with  any  other  method.  Contributing  to  this  last  function  are  the 
portability  of  the  device  and  its  probable  selling  price,  which  would  permit 
personal  ownership.  The  device  is  relatively  simple  from  an  engineering 
standpoint,  thereby  keeping  maintenance  requirements  at  a  relatively  low 
level. 

In  the  future  the  same  advantages  would  be  applicable.  Certainly  more 
complex  machines  with  an  easier  output  to  learn,  or  an  output  involving  no 
learning,  will  either  involve  a  considerable  investment  for  the  individual 
blind  user  or  will  be  completely  beyond  his  financial  resources.  Even  the 
recognition  machines  postulated  for  future  use  in  libraries  or  other  central 
institutions  will  occasion  some  personal  inconvenience  for  the  blind  user. 
The  advantages  of  a  personal  reading  machine  are  considerable  even  in 
view  of  their  apparent  disadvantages.  If  the  apparent  disadvantages  of  the 
machines can be minimized through further research, an even greater
contribution will have been made in assisting part of the blind population to
realize  their  full  potential. 

The  acceptance  of  the  Battelle  device  has  been  very  good  for  the  ma¬ 
jority  of  the  blind  persons  trained  in  its  use;  they  realize,  however,  it  is 
by  no  means  the  final  answer  to  all  their  reading  needs.  They  do  feel  that 
it serves a useful purpose, and will continue to do so until some other
more  useful  method  is  developed. 

REFERENCES 

1. Coffey, J. L., E. H. Hull, D. M. Metcalfe, and L. J. Mason, “The Further
Development of Aural Reading Devices for The Blind,” in Battelle Research
Report, June, 1961.

2. Cooper, F. S., “Research on Reading Machines for the Blind,” in P. A. Zahl (ed.)
Blindness: Modern Approaches to the Unseen Environment. Princeton:
Princeton University Press, 1950, pp. 512-543.

3. Freiberger, H., and E. F. Murphy, “Reading Machines for the Blind,” I.R.E.
Professional Group on Human Factors in Electronics, March 1961, pp. 8-19.


THE USE OF THE REMAINING SENSORY CHANNELS
(SAFE ANALYZERS) IN COMPENSATION OF
VISUAL FUNCTION IN BLINDNESS*


M. I. ZEMTZOVA, J. A. KULAGIN, and L. A. NOVIKOVA
Institute  of  Defectology,  Academy  of  Pedagogical  Sciences,  R.S.F.S.R. 


Modern  society  with  its  technical  progress  demands  from  man  a  high  and 
harmonious  development  of  his  mental  and  physical  abilities. 

Humanistic  treatment  of  blind  people  by  the  Soviet  state  and  Soviet 
people makes it necessary to establish a scientific system of education and
thorough development of blind people, on the basis of which they can be
prepared for life and work in society. This problem is solved in
connection with the organization of the universal secondary and nine-year
education  for  the  blind,  the  organization  of  polytechnical  education  in 
schools  for  the  blind,  the  development  of  the  system  of  professional  train¬ 
ing  on  the  basis  of  which  the  vocational  problems  for  the  blind  can  be 
solved. 

At  the  present  time  all  blind  children  of  school  age  are  in  a  system  of 
universal  education.  State  schools  for  the  blind  provide  the  pupils  with  a 
general  educational  program  equal  to  the  program  of  normal  school,  poly¬ 
technical,  aesthetical,  physical  and  moral  education,  and  prepare  them 
for  different  types  of  professional  activities.  Blind  people  in  the  Soviet  Union 
work in different fields of industry, using different types of technical
equipment and modern progressive methods of work; many of them have been
very  successful  in  intellectual  work. 

Investigation  of  the  abilities  of  blind  people  became  very  important  in 
connection with new perspectives of further development of general,
polytechnical and professional education of the blind. People of different
professions (teachers, psychologists, physiologists, clinicians, engineers, and
hygienists) are involved in such kinds of investigation.

* This paper was submitted in English for the International Congress on
Technology and Blindness to be included in these Proceedings. Several minor
stylistic changes in the paper have been made to render its English more
idiomatic. Where words are substituted, the original is put in parentheses,
so that both sense and original form are retained. (Ed.)

The  present  investigation  represents  a  part  of  general  work  in  the  field 
of  compensation  of  blindness  and  the  development  of  blind  people’s  per¬ 
sonality.  We  shall  discuss  here  principles  and  mechanisms  of  the  use  of  the 
remaining  sensory  channels  and  correlative  sensory  aids  (safe  analyzers) 
in  compensation  of  the  visual  function  in  blindness.  The  understanding  of 
mechanisms  of  compensation  in  blindness  helps  us  to  control  them,  to 
create  rational  methods  of  education,  to  organize  corrective  educational 
work  preventing  some  secondary  negative  results  of  blindness,  and  to  de¬ 
velop  new  methods  of  the  perception  of  the  world  by  the  blind. 

The  investigation  of  the  use  of  the  remaining  senses  (safe  analyzers) 
by  the  blind  is  also  of  great  importance  for  designing  and  constructing 
different technical devices for the blind (reading machines, measuring
instruments, different types of models for education, devices helping the
blind in labor and space orientation, etc.).

For  a  long  time  the  problems  of  physiological  mechanisms  and  prin¬ 
ciples  of  substituting  visual  functions  in  the  blind  have  drawn  the  attention 
of many investigators. But it is necessary to mention that these
investigations were not based on a sufficient theoretical and experimental analysis,
which sometimes brought the authors to contradictory conclusions. The
attention  of  many  investigators  was  concentrated  on  the  problem  of  what 
mechanisms  and  functions  became  deranged  after  the  loss  of  vision,  while 
the  most  important  task  is  to  investigate  what  remains  unimpaired  (safe) 
and  how  the  compensation  of  lost  functions  goes  on. 

Our  investigation  was  based  on  the  teaching  of  I.  P.  Pavlov  and  I.  M. 
Setchenov  and  on  the  new  theories  of  compensation  developed  by  Soviet 
scientists,  P.  K.  Anokhin  (1),  E.  A.  Asratyan  (2),  L.  S.  Vygotsky  (15), 
A.  M.  Lomkina  (7),  A.  N.  Leontiev  (6),  A.  R.  Luria  (8),  and  others. 
In our investigation we tried to make clear how the rearrangement of the functions
of the analyzers develops after the loss of vision, which plays such
an important role in the life of man.

The investigation was carried out on the basis of a synthesis of many
years of observation of blind people in the course of education
and work; psychological, physiological, and educational experiments;
the organization of experimental educational programs in schools for the
blind; and observations of the labor activity of the blind in different spheres of
manual and intellectual work. In exploring physiological mechanisms of
compensation, electrophysiological methods were widely applied. Theoretical
analyses of the modern theories of compensation and the results of
our own investigations permitted us to formulate some principles of
compensating functions after the loss of vision.

In the light of modern theories of interaction and interchange of different
cortical sensory centers (analyzers), the processes of compensation
are realized on the basis of normal physiological mechanisms, i.e., the
mechanism of analyzers and the mechanism of cortical associative function
working on the principle of the conditioned reflex. With the help of these
mechanisms the perception of information from the outer world, and its
processing, selection, and use in the course of human activity, take place.

An important role in the processes of compensation belongs to the feedback
mechanism, which serves for evaluation and correction of the results
of performed action with the help of auditory, somesthetic, and other
analyzers used in the process of human activity.

The loss of vision leads to change in all the afferent systems and in the
cortical neurodynamics. As a result of the limitation of afferentation after
vision loss, a broad irradiation of nervous processes and the involvement of
different trace connections, which are of secondary importance in the
presence of vision, are observed. A normal human organism possesses great
reserves which are not used, or are used only to a limited extent, in normal
conditions with the presence of the visual system. In the course of activity of
people with a vision loss a wide use of the remaining sensory channels
(safe analyzers) becomes necessary. These analyzers provide them with
different types of information which in the presence of vision is redundant.
Different organs and systems can be used as reserve sources for signal
reception. Multiple signal reception from different sensory channels (analyzers)
provides conditions for the formation of complex dynamic systems
of connections which play an important role in compensation for blindness.
In the early period of blindness a wide irradiation of neural (nervous)
processes takes place. This creates favorable conditions for the formation
of conditioned interconnections.

The change in the sensory systems (system of analyzers) after onset
of blindness (vision loss) is not limited to the change of some isolated
functions but involves the whole central nervous system. The development
of processes of compensation evokes a change of the type of intra-analyzer
connections and mechanisms of cortical regulation. In the absence of vision
the mechanisms of cortical regulation are based on the extension of the
use of hearing, tactile, motor, and other systems of sensory analysis (safe
analyzers) which have a compensatory function in blindness.

Afferentation of the cortex in blindness is supplied not only through
the use of signals from receptor organs but also through previously formed
and  stored  dynamic  systems  of  interconnections.  In  persons  who  used  vision 
before  they  became  blind  the  traces  of  their  visual  experience  are  included 
in  the  patterned  interconnection  (systems  of  connections).  This  enriches 
their  impression  of  the  outer  world  and  becomes  an  important  factor  in 
their  spatial  orientation. 

Loss of vision leads to important changes in the dynamics of spatial
orientation (orienting reactions). The orienting reactions in the blind are
characterized by different relations between cortex and subcortical structures.
The reticular formation becomes a source of strong excitatory impulses,
stimulating the cortex and maintaining the level of its excitability (7).
This type of change in nervous processes can be observed at the first stage
of blindness or in conditions of orientation which are exceptionally
difficult for the blind. The structure of spatial orientation (orienting reactions)
is not fixed and can change in connection with the requirements of
the external conditions (9).

Many authors have shown that, of the higher mental functions, speech and
thinking are most important for developing compensation in the blind.
The processes of rearrangement and substitution of functions in blindness
do not appear spontaneously as a result of biological predetermination
but are formed in the active voluntary activity of man and depend upon the
contents and conditions of this activity. The mastery, in the course of education,
of social experience, which is realized in active labor activity, in
language as a means of intercourse and learning, in the products of labor,
and in the social forms of life, plays an important role in the appearance and
rearrangement of nervous processes. The mastery of social experience with
the help of language in the course of education is an important factor in
the development of higher functions in the blind: perception, voluntary
attention, creative and reproductive imagination, logical memory, and abstract
thinking.

For  understanding  mechanisms  and  ways  of  compensation,  we  shall 
analyze  certain  experimental  data. 

The  physiological  basis  of  psychological  processes  is  provided  by  the 
mechanisms  of  cortical  activity.  To  show  in  what  way  these  mechanisms 
change  after  the  vision  loss  we  investigated  electrical  activity  of  the  brain  in 
150  blind  people. 

It was shown that the electroencephalogram (EEG) of the blind differs
greatly from that of 99 percent of sighted people and is characterized by the
absence or poor expression of the alpha rhythm, which dominates in the
EEGs of sighted people. Together with the absence of the alpha rhythm
a pronounced depression of electrical activity in the blind can be shown
(Figure 1).

On the basis of investigation of blind people as well as experiments on
animals it is possible to conclude that the specific EEG of the blind is a
manifestation of the lowered level of cortical excitation, resulting from
the absence of visual afferent input. Thus the complex reflective activity
in the blind is carried out against a background of a certain lowering
of the level of cortical excitation.

Figure 1 Electroencephalogram of Blind Subject Showing Absence of
Alpha Rhythm and Pronounced Depression of Electrical Activity

In connection with the problem of sensory compensation, it is interesting
to note that the depression of electrical activity in the blind is most pronounced
in the occipital region of the cortex (Figure 2). In the sensorimotor region
of the cortex in the blind, as well as in people with useful (safe) vision,
the M rhythm (Rolandic rhythm) can often be registered (Figure 3). It is
important that the Rolandic rhythm, augmented under the influence of proprioceptive,
tactile, and sound stimulation, spreads to the occipital region
of the cortex (Figure 4). This fact testifies to the preservation of the intercentral
connections of the occipital cortex with the other regions and can
be used to explain the mechanisms and processes of compensation in
the blind, which provide additional afferent input on the basis of the use of other
analyzers.

Figure 2 The Depression of Electrical Activity is Most Pronounced in the
Occipital Region

Figure 3 M Rhythm or Rolandic Rhythm in a Blind Subject

Figure 4 Spread of Rolandic Rhythm to Occipital Region Under the Influence
of Proprioceptive, Tactile, and Sound Stimulation

Experiments on rabbits showed that the lowering of the cortical excitation
level resulting from the absence of visual afferent input evokes disinhibition
of subcortical structures, particularly of the reticular formation (Figure
5).

Thus  the  peculiarity  of  the  neurodynamics  among  the  blind  manifests 
itself  not  only  in  the  lowering  of  the  cortical  excitation  but  also  in  the 
increase  of  the  level  of  excitation  in  subcortical  structures.  This  increase 
of  the  excitation  level  in  subcortical  structures  serves  as  one  of  the  sources 
of compensatory excitation of the cortex and explains several specific features
of vegetative reactions in the blind.

In the experiments on blinded animals, complex relations between the
cortex and the reticular formation were observed. Several months after the
blinding of a rabbit, when the level of electrical activity in the cortex
had fallen to 30 to 40 percent of the initial level, a strong excitation
of the reticular formation was observed. In the period of maximum exaltation
in the reticular formation, a certain augmentation of the cortical excitation
level was noticed. At later stages after the blinding of the animal, the
interrelations between the cortex and subcortical structures return to normal.

Figure 5 Disinhibition of Subcortical Structure in a Rabbit EEG Resulting
from Absence of Visual Afferent Input

In a series of experiments the electrical activity of the brain after
enucleation of the eyes was compared with that after keeping the animal in
absolute darkness for several months. It appeared that prolonged keeping in
darkness resulted in the same lowering of the amplitude of the cortical
electrical potentials as enucleation. After the sighted animal was brought
into conditions of normal illumination, the level of excitation in the
cortex grew continuously, eventually returning to the initial state (Figures
6 and 7). On the basis of these observations the hypothesis can be proposed
that the changes of the electrical rhythms in the cortex of the blind are
based not on morphological (structural) but on functional changes.


Figure 6 Lowering of Amplitude of Cortical Electrical Potential As a
Result of Prolonged Darkness Over a Period of Several Months

Figure 7 Restoration of Normal Amplitudes of Cortical Electrical Potential
When Animals Are Restored to Conditions of Normal Illumination

These  observations  are  of  interest  for  the  problems  of  rehabilitation 
of  the  visual  function. 

From  the  experiments  described  above  it  is  obvious  that  loss  of  vision 
evokes  important  neurodynamic  changes,  which  represent  compensatory 
rearrangements  in  the  brain. 

The rearrangement of the functions of the central sensory systems (analyzers) in man
develops in the course of practical activity and depends upon its contents
and conditions. The improvement of analyzers and synthesizers is selective.
Thus, in learning Morse code or in using reading machines with auditory
output, the auditory (hearing) analyzers and synthesizers develop. When
learning is based on the braille system, stenography, or a graphical alphabet,
tactual perception improves. The development of the analyzers
and synthesizers results not from the elementary sensory functions but from
the systemic activity of central nervous functions (safe analyzers). The development
of the remaining sensory channels (analyzers and synthesizers)
is based on the formation of both intra-analyzer and inter-analyzer connections.

Comparative  study  of  changes  with  age  of  sensory  functions  in  blind 
children  and  children  with  normal  vision  showed  that  there  are  trends  in 
the  development  of  sensory  channels  (analyzers),  depending  upon  the  age: 
with  age,  both  in  blind  and  sighted  children,  sensitivity  rises  to  a  certain 
level  (5). 

In this process the blind are able to attain a higher level of development
of tactile and hearing perception than people with normal sight. This
results from the fact that in the course of education blind people use
tactile and hearing perception more often, and this provides conditions for the
development and improvement of the tactile and hearing analyzers and synthesizers.
In appropriate conditions people with normal sight are also able
to attain a high level of development of different sensory functions.

Different sensory channels (analyzers) play unequal roles in the processes
of compensation. For the perception of the objects of the outer world
tactile and proprioceptive analyzers are most important. With the help of
these channels (analyzers) the blind directly perceive the objects of
their environment. Kinesthetic and tactual perception also play important
roles in the feedback mechanism, providing signals from the peripheral
organs, on the basis of which estimation and correction of movements
in labor activity, walking, sport, etc., can be carried out by blind people.

The significance of tactile perception for compensation in blindness
was noted by I. M. Setchenov: “The hand feeling objects gives to the
blind everything that we receive through the eye, except the colors of
objects and the perception at a distance” (11). I. M. Setchenov supposed
that the “image” of tactile perception and the image in visual perception
have a basic likeness in their contents insofar as they represent in the
human brain the objects and phenomena of the outer world.

The  theory  of  I.  M.  Setchenov  on  the  interrelationship  among  sensory 
channels  (intersubstituting  functions  of  analyzers)  and  their  interaction  is 
very  important  in  understanding  the  process  of  transfer  among  sense 
modalities  (of  use  of  the  safe  analyzers)  in  blindness. 

Soviet and foreign authors have repeatedly discussed the question of how
and to what extent tactile perception can substitute for vision in the perception
of objects and phenomena of the outer world. It is known that
with the help of tactile perception, as well as with the help of vision, it is
possible to distinguish different features characterizing objects: their structure,
form, size, etc. All these features are synthesized and, as a result,
a detailed “image” of the object is created. It was shown that on the basis
of interaction and intersubstitution of sense modalities (analyzers) the formation
of the image can be carried out on the basis of very limited sensory
information, even in the absence of vision, hearing, and speech. In these
cases, sensory input (afferentation) is provided by tactile and kinesthetic
cues and by the transfer of sensory traces in the neuropile wherein substitute
sensory processes are developed on the principle of the conditioned
reflex (13, 16).

Tactual perception in the blind develops according to the same laws
as vision and other forms of perception. Experiments with the perception
of figures evoking visual illusions provide good examples. Perceptual
illusions are the expression of connective activity which in everyday
life provides correct, adequate perception of the outer world and only
in artificial conditions, created by specially chosen figures and materials, creates
the effect of illusion.

In  our  experiments  the  figures  in  relief  evoking  visual  illusions  were 
presented  to  40  pupils  of  the  fourth  and  senior  grades  of  the  school  for  the 
blind.  Among  the  pupils  19  were  totally  blind  and  21  possessed  the  remnants 
of  vision  (up  to  0.05).  Tactile  perception  of  the  figure  was  carried  out 
by  the  moving  hand.  Each  figure  was  presented  only  once. 

During the tactile perception of the figure evoking the Müller-Lyer
illusion (Figure 8), the illusion was obtained in 30 pupils of the school for the
blind. Ten pupils perceived the segments as equal by measuring them with
the help of their fingers. To this group, the figures were presented again,
this time with measurement prohibited. In these conditions, the Müller-Lyer
illusion was observed in all the pupils. This fact provides a good
illustration of Setchenov’s concept of the role of the movement of a feeling
hand being similar to that of a seeing eye.

The results of experiments with the Müller-Lyer illusion coincide with
the data described by G. Revesz (10).


Figure 8 The Müller-Lyer Illusion

We shall also describe the experiments with a figure evoking the illusion
of perspective, which seems to be specific to vision and, presumably
because of that, was not investigated by G. Revesz.

Seven pupils with total blindness and eight pupils with remnants of
vision judged the vertical segments in Figure 9 to be unequal. Such
results were obtained because in our schools relief figures and diagrams
are  widely  used.  In  these  figures  there  are  some  elements  which  are  arranged 
in  accordance  with  the  laws  of  visual  perspective.  Practice  of  perception  of 
such  figures  by  blind  children  resulted  in  the  formation  of  appropriate 
nervous  mechanisms — systems  of  conditioned  connections,  corresponding 
to  the  laws  of  perspective  drawing  for  the  sighted. 


Figure 9 Perspective Illusion Test Pattern. See text for details.

Contrary to the conclusion made by G. Revesz in the above-mentioned
paper about the “inborn nature” of perceptual illusions, our experiments
show the dependence of the presence of illusion upon the practice of perception.
Among the blind children participating in the experiments, there
could be distinguished a group which showed the presence of different illusions.
These pupils also possessed the most developed tactile perception.

The role of practice for the development of perception can also be
shown by the following experiment. It is well known that the Charpentier
illusion can be observed in the blind (K. Bürklen [3]). In our experiments
the pupils of the fourth year of education and older groups showed the
presence of this illusion in conditions of tactile perception of comparable
weights, i.e., on the basis of determination of their volumes. If tactile perception
is excluded (lifting the weights by attached strings) the illusion
vanishes, as with sighted people with closed eyes. At the same time, experiments
with the pupils of the second and third year of education in the
school for the blind (16 pupils) showed the absence of the illusion of
weight. Thus, as a result of restricted practice of perception, the neural
mechanism underlying the Charpentier illusion in blind children forms
only at the age of 11 to 12, while in children with useful vision it already
exists at 5 to 6 years.

The same 16 blind pupils of the second and third year of education
were presented with a relief figure evoking the Müller-Lyer illusion. In three
of them, the illusion was absent. These pupils also showed the absence of
the illusion of weight.

The tactile image gives an adequate representation of
reality but it is much poorer than the visual image. That is why, in creating
the “image” of the object, the blind use logical constructs developed from
language so extensively. The question arises whether the perception of the
blind is based on abstract schemata or on concrete
images. In this respect the investigations of topographic images in
the blind are very interesting. They showed that the blind, as well as people
with normal sight, are capable of making a correct mental reproduction of
a topographical map, showing the position of different objects in space
in their mutual relations and in their relation to the perceiving subject. They
can easily determine the directions of the geographical points surrounding
them with nearly the same degree of accuracy as people with normal sight.
The images arising in these conditions in the blind are not schematic
but are characterized by a specific “motor explicitness.”

The types of images in people with and without sight are different, but
their content is common to both because they both represent the objectively
existing reality (4, 12). An important role in the evaluation of the position and
direction of geographical objects belongs to the searching apparatus (effector
systems), provided by eye movement in the normally seeing
person when looking around. In the blind the eye movement is replaced
by the movement of the feeling hand. In persons who previously used
vision, in the process of estimating the position and direction of objects,
the whole visual-kinesthetic dynamic complexes underlying a visual image
and formed in the course of individual experience are reproduced. In these
cases the body position (pose), position of the head, and “adjustment of the
eyes” typical of people with sight are observed. The person who has lost vision
behaves as if he were visually observing objects placed in different directions,
without any stimulation of the retina.

Investigations of the perception of sculptural portraits by the blind are
of great interest. People who were born blind and have never used vision can, as a
result of tactile perception of the sculpture and subsequent verbal description
of the image, correctly evaluate the shape of the face, eyes, nose,
mouth, head, and body position (the pose), expressive movement, the significance
of gestures, features of the portrait resemblance, and the expression of emotions
(14). Many blind people have created remarkable sculptures. An especially
high level of perfection in the art of sculpture can be achieved by people
who became blind later in life. The sculptress Lona Po, who lost her vision
as an adult, created remarkable sculptures in which she skillfully represents
human emotions, movement, shapes, proportions, and grace (Figures 10 and
11). She created many sculptures possessing high social significance (The
Heroic Deed, The Guerillas, The Woman with the Label, etc.). Some of her
sculptures are exhibited in the Tretyakov picture gallery.

Very  interesting  are  the  sculptures  of  the  blind  Malkovsky  (German 
Democratic  Republic).  His  sculpture  Seeing  Hands  is  especially  impressive. 
In  our  schools  for  the  blind  acquaintance  with  sculpture,  together  with  the 
broad  musical  education,  is  an  important  means  of  aesthetic  education.  An 
important  role  is  played  in  the  compensation  of  blindness  by  special 
graphical  methods;  relief  designing  and  relief  drawing  are  widely  used  in 
schools  for  the  blind. 

Common features of visual and tactual perception enable us to use, in
special relief representation (graphics), many of the types of representation
used in inkprint two-dimensional reproductions (flat diagrams
and drawings). But in making relief drawings, the specific features of tactual
perception must also be taken into account.

Figure 10 Sculpture by Lona Po

Figure 11 Sculpture by Lona Po

Orthogonal projections can be used with the addition of some special
notation for relief drawing. Pupils can not only learn to read and reproduce
relief designs correctly but can also draw the design of the object themselves
or reproduce the object in three dimensions with the use of air-setting
plastic material (Figure 12).

Relief drawings, unlike designs, represent some qualities of the object that
are obvious to tactual perception, especially its shape. In schools for the blind
different types of relief drawings, applying different methods of representing
objects (contour, application, low relief), are very popular. In the
preparation of relief drawings the elements of visual perspective
are widely used (Figure 13).

Figure 12 Three Dimensional Recreation of Relief Objects Produced by
Pupils in Air-Setting Plastic

Good results were obtained in the course of experimental training of
blind children in understanding relief drawings with the help of the method
of nonuniform dot covering and perspective change of size. In such a
drawing the whole surface of the object represented is covered with relief dots,
the density of which is higher the nearer the given segments of the surface
are to the person tactilely perceiving the object. By the nonuniform density
of the dots the shape of a surface of any complexity can be reproduced.
The relief drawing is made in correspondence with the laws of visual perspective.
In explaining this method of representation to the pupils the
laws of vision need not be referred to; everything can be understood just on
the basis of tactual perception.
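
The method of nonuniform dot covering can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the text says that dot density is higher the nearer a segment of the surface is to the perceiver, and that size changes according to perspective, but it gives no formula. The linear falloff of density, the inverse-distance rule for size, and the names dot_density, apparent_size, and relief_dots are hypothetical choices made here and should not be read as the authors' procedure.

```python
import random


def dot_density(distance, near, far, max_density=1.0, min_density=0.2):
    """Fraction of grid cells that receive a relief dot at a given distance.

    Assumption: density falls off linearly from max_density at `near`
    to min_density at `far`; distances outside that range are clamped.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    clamped = min(max(distance, near), far)
    t = (clamped - near) / (far - near)
    return max_density - t * (max_density - min_density)


def apparent_size(true_size, distance, reference_distance):
    """Perspective change of size (assumed rule: inversely proportional to distance)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return true_size * reference_distance / distance


def relief_dots(depth_map, near, far, seed=0):
    """Turn a grid of surface distances into a grid of booleans.

    True marks a cell where a relief dot would be embossed. Nearer cells
    are more likely to carry a dot, so nearer regions feel denser.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < dot_density(d, near, far) for d in row]
        for row in depth_map
    ]
```

For example, relief_dots([[1.0, 2.0, 3.0]], near=1.0, far=3.0) places dots with probability 1.0, 0.6, and 0.2 in the three cells, so a surface receding from the reader would feel progressively sparser under the finger.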

Experimental  training  was  performed  using  the  flat  geometrical  figures 
which  can  be  placed  at  different  angles  of  rotation  for  tactile  perception. 
Relief  drawings  of  these  figures  were  prepared  for  six  different  angles  of 
rotation. 

Figure 13 Relief Drawing Produced by a Pupil in a School for the Blind
and the Object from which it was Produced

The training sessions of blind pupils were repeated many times. By the
fourth lesson the blind children made a correct choice of the drawing
corresponding to a specified angle of rotation of the figure. A sharp decline
in errors can be achieved by letting the pupils themselves set the angle of
rotation of the figure on the basis of the drawing. Whereas at the first lesson the
error is approximately 15 degrees, by the sixth or seventh lesson
it is reduced to 3 to 4 degrees.

After training with the flat figures the pupils were given drawings
of three-dimensional objects. They correctly modeled these objects in air-setting
plastic material, correctly determining the size of objects which
were represented in the drawing at different distances.

The  use  of  relief  drawings  and  diagrams  shows  great  possibilities  of  the 
development  of  the  complex  forms  of  tactual  channels  (synthesizers  and 
analyzers)  in  blind  children. 

Hearing  perception,  closely  connected  with  the  development  of  speech 
and  thinking,  is  of  great  importance  for  the  formation  of  images  in  the  blind. 
With  the  help  of  speech  the  blind  can  communicate  with  people  around 
them,  learn  to  read,  write,  draw,  receive  new  information  from  books,  hear 
stories  told  by  seeing  people  and  from  other  sources  (music,  lectures, 
radio,  theatre,  etc.).  All  this  helps  their  intellectual  development. 

In discussing the role of the remaining sensory channels (safe analyzers)
in the substitution of the visual function in the blind, it is necessary to
mention investigations of obstacle perception by the blind. Comparative
experiments have shown that discrimination of distantly placed objects
which produce no sounds is not peculiar to the blind. Under certain
conditions sighted people are also able to perceive objects at a distance
without the use of vision. But because they must use this kind
of perception constantly, the blind bring it to a high degree of perfection.
With specially organized training this degree of perfection is attained more
rapidly.

In obstacle perception at a distance complex afferent systems (hearing,
kinesthetic, tactile, temperature, and other analyzers) are involved. Complex
synthetic stimuli and signals from different channels (analyzers) help
the blind to represent their environment correctly and thus to orient themselves
in space. The preponderant role in this process belongs to the auditory
channel (hearing analyzer). In the course of orientation the blind perceive
the slightest changes of pitch, loudness, and timbre (quality) of a sound, and
the direction of its source. The predominant signaling role in this process
belongs to reverberated sounds. Making use of them, the blind are able to
perceive objects at a distance. These sounds, as components in the complex
system of different connections, provide an important means of orientation
for the blind during movement.

In summary we have shown that after vision loss, different unused
capacities of the remaining sensory channels (reserve functions of the safe
analyzers) are mobilized. This permits the blind to achieve a high level of
perfection in formation of spatial perception and cognitive processes (logical
thinking).

Continuous  and  frequent  use  of  the  remaining  sensory  channels  (safe 
analyzers)  in  the  process  of  learning  and  labor  provides  favorable  conditions 
for  a  thorough  development  of  the  personality  and  for  the  adjustment  of  the 
blind to life and work in modern society.

REFERENCES 

1. Anokhin, P. K., “The General Principles of the Compensation of Functions and
   Their Physiological Significance.” Thesis presented at the VII All-Union
   Congress of Physiologists, 1955.

2. Asratyan, E. A. The Physiology of the Central Nervous System. Moscow, 1953.

3. Bürklen, K. Blindenpsychologie. Leipzig, 1924.

4. Hapreninova, N. G., “The Peculiarities of the Cognitive Activity in the Blind,”
   in On the Problems of the Perception of Direction in the Blind. Moscow, 1958.

5. Kekcheiev, K. N. Interoception and Proprioception in Their Clinical Significance.
   Moscow-Leningrad, 1946.

6. Leontiev, A. N. The Problems of Mental Development. Moscow, 1959.

7. Lomkina, A. M. On the Physiological Basis of the Compensation of Deranged
   Functions. Leningrad, 1956.

8. Luria, A. K. The Rehabilitation of Brain Functions After War Trauma. Moscow,
   1948.

9. Paramonova, N. P., and E. N. Sokolov, “On the Problem of the Reactivity of the
   Hearing Analyzer in the Blind,” in Proceedings of the Conference on
   Defectology. Moscow, 1958.

10. Révész, G., “System der optischen und haptischen Raumtäuschungen,”
    Z. Psychol., Vol. 131 (1934).

11. Setchenov, I. M. Selected Philosophical and Psychological Works. Moscow,
    1947.

12. Shemyakin, F. N., “Investigation of the Topographical Images,” Annual Reports
    of the Academy of Pedagogical Sciences of RSFSR, Vol. 53 (1954).

13. Sokolyanski, I. A., and A. I. Meshchergyakov (ed.), “Education of the
    Deaf-and-Blind,” Annual Reports of the Academy of Pedagogical Sciences of
    RSFSR, Vol. 121 (1962).

14. Sverlov, V. S., “The Perception of Sculpture by the Blind,” Annual Reports of
    the Academy of Pedagogical Sciences of RSFSR, Vol. 121 (1962).

15. Vygotsky, L. S. Selected Psychological Works. Moscow, 1956.

16. Yarmolenko, A. V. Essays on the Psychology of the Deaf-and-Blind. Leningrad,
    1961.


REVIEW  AND  SUMMARY 


FRANKLIN  S.  COOPER 

Haskins  Laboratories,  Inc.,  New  York,  New  York 


In  these  three  sessions  on  reading  machines  for  the  blind,  we  have  heard 
about  a  very  diverse  set  of  topics.  They  have  ranged  from  extremely 
sophisticated  engineering  to  equally  sophisticated  psychology;  they  have 
dealt  with  a  wide  time  span  in  the  development  of  devices — all  the  way 
from  rigorous  testing  of  devices-in-being  to  speculations  about  how  the 
improved  devices  of  the  future  may  some  day  be  designed. 

Perhaps  you  have  your  own  check  list  of  speakers  and  topics.  This  is 
mine:  Four  papers  dealt  primarily  with  applications  and  devices,  ranging 
from  the  very  specific  devices  discussed  by  Coffey  and  Beddoes  to  the 
applications  of  pneumatic  logic  and  tactile  displays  described  by  Bliss,  and 
the  use  of  “spelled  speech”  demonstrated  by  Metfessel.  Four  more  papers 
were  concerned  with  the  general  principles  of  reading  machines  and  how 
their  outputs  could  be  matched  to  the  receiving  capabilities  of  the  user. 
Clowes  and  Mauch  talked  mostly  about  machines  that  operate  at  the  letter- 
by-letter  level,  while  Selfridge  stressed  the  differences  between  vision  and 
audition  in  processing  patterned  information,  and  Studdert-Kennedy  spe¬ 
cialized  the  general  case  to  a  requirement  for  natural  speech  in  the  output 
of  a  high  performance  reading  machine.  Then  there  were  three  papers  on 
optical  character  recognition.  Rabinow  and  Kazmierczak  summarized  de¬ 
velopments  and  prospects  in  the  United  States  and  in  Europe;  Scharff  urged 
a  coordinated  design  philosophy  in  any  developments  undertaken  in  this  field. 

Such  a  diversity  of  topics  defies  a  straightforward  summary.  I  shall  not 
attempt  the  impossible,  nor  shall  I  impose  on  you  an  inadequate  review  of 
topics  that  our  participants  have  just  presented  clearly  and  at  length.  I 
shall  try,  instead,  to  relate  these  contributions  to  the  broad  problem  of  de¬ 
vising  reading  machines  for  the  blind  and  to  possible  approaches  to  a 
solution. 


THE  READING  MACHINE  PROBLEM 
IN  GENERAL  TERMS 

Let me start, even though it may seem unnecessary, by restating the major
requirements  for  a  reading  machine  for  the  blind.  Briefly,  the  blind  man 
wishes  to  read  a  wide  range  of  the  same  books,  magazines,  and  typewritten 
materials  that  his  sighted  friends  are  reading;  he  would  like  to  have  a 
personal  device,  reasonably  portable  and  inexpensive,  for  use  whenever  he 
wants  it,  as  well  as  access  to  materials  in  libraries;  finally,  he  would  like  the 
device  to  talk  to  him,  for  he  has  no  more  time  than  anyone  else  for  the 
drudgery  of  learning  a  machine  language  and  listening  to  a  letter-by-letter 
rendition  of  the  printed  page. 

There  are  some  inherent  contradictions  in  these  objectives.  A  machine 
that  can  read  rapidly  in  plain  English  is  unlikely  to  be  portable  and  cheap 
enough  for  personal  use;  indeed,  designs  based  on  reasonable  extrapolations 
of  present  technology  are  sure  to  be  large  and  expensive  even  for  use  in 
libraries.  Similarly,  it  is  unlikely  that  the  performance  of  Optophone-like 
devices  can  be  improved  by  an  order  of  magnitude  through  the  discovery 
of  some  ingenious  nonspeech  code.  We  may  be  faced,  at  least  for  the  near 
future,  with  inescapable  compromises  between  the  performance  of  the 
machine  and  its  complexity;  however,  we  dare  not  compromise  with  its 
capability  to  read  ordinary  printed  materials.  These  points  may  appear  ob¬ 
vious  and  I  would  not  have  stressed  them,  except  that  one  still  en¬ 
counters — as,  indeed,  we  have  here — recurring  proposals  for  machines  that 
require  specially  prepared  input  materials  and  for  research  programs  to  re¬ 
explore  coding  possibilities  that  have  been  examined  almost  every  decade 
since  Fournier  d’Albe  invented  the  Optophone  half  a  century  ago. 

So  much  for  the  problem;  what  kinds  of  reading  machines  can  we  en¬ 
vision?  Three  main  lines  of  development  are  indicated  in  Figure  1.  They 
all  start  with  some  means  for  scanning  the  printed  page  and  converting  its 
black  and  white  letter  shapes  into  electrically  coded  signals.  The  diagram 
deals  rather  casually  with  this  process,  though  in  fact  it  is  a  rather  complex 
operation  and  one  that  will  be  expensive  to  automate;  however,  the  develop¬ 
ment  of  suitable  document  handling  and  scanning  devices  should  be  fairly 
straightforward,  so  perhaps  I  may  be  excused  for  devoting  most  of  the 
diagram  to  those  areas  of  signal  processing  where  research  as  well  as  de¬ 
velopment  are  still  required. 

DIRECT TRANSLATOR TYPES
OF READING MACHINES

Of  the  various  types  of  reading  machines,  the  direct  translator  (top  line  of 
Figure  1)  has  received  by  far  the  most  attention.  The  operating  principle 
common to a large variety of such devices as the Optophone, Battelle
aural  reading  device,  and  the  Argyle  is  that  the  shape  of  the  letter  on  the 
printed  page  is  converted  into  an  acoustic  shape  or  pattern  for  presentation 
to  the  listener.  The  acoustic  code  that  results  from  this  conversion  will 
necessarily  be  an  arbitrary  one,  but  it  should  be  possible  for  a  listener  to 
learn  the  rather  limited  number  of  sounds  that  correspond  to  letters  and 
punctuation  marks. 

Such  devices  are  appealing  in  their  simplicity — and  appalling  in  their 
performance.  One  might  not  object  to  learning  an  arbitrary  code  if  he  could 
reasonably  expect  to  become  proficient  in  its  use.  This  has  not  proved 
possible  with  any  of  the  existing  schemes  and  there  are  in  fact  good  reasons 
for  supposing  that  inherent  limitations  (in  human  perception  rather  than  in 
the  devices)  will  continue  to  frustrate  inventors  of  this  class  of  reading 
machine. 

This  seems  a  rather  grim  prospect;  if  one  contemplates  reading  his 
favorite  magazine  at  something  between,  say,  10  and  30  words  per  minute, 
it  is  grim  indeed.  But  for  some  tasks,  such  as  reading  personal  correspond¬ 
ence,  locating  and  identifying  documents,  finding  telephone  numbers,  and 
the like, the rate of reading is very much less important than the ability to
read  at  all.  We  owe  a  debt  of  gratitude,  I  believe,  to  Dr.  Eugene  Murphy  for 
realizing  clearly  the  importance  of  this  simple  virtue  of  Optophone-like  de¬ 
vices,  and  for  having  persevered  in  the  effort  to  develop  one  that  would  be 
small  enough  and  cheap  enough  to  be  readily  available  to  the  individual 
when  he  needs  it.  It  is  in  this  light  that  we  should  assess  the  very  considerable 
success  reported  here  in  the  development  and  testing  of  the  Battelle  device. 

READING  MACHINES  THAT  TALK 

Why  not  require  the  reading  machine  to  speak  plain  English — or  whatever 
language  the  user  himself  speaks?  There  would  be  no  problem  then  of 
learning  an  alien  and  cumbersome  acoustic  code  or  of  being  unable  to  read 
rapidly.  This  would  be  an  obvious  and  elegant  solution,  but  also  a  complex 
and  expensive  one.  It  may  be,  nevertheless,  the  answer  we  should  seek. 

Character recognition is one of the necessary steps in making the printed
page talk, as I have indicated in the bottom line of Figure 1. There are
several papers on optical character recognizers in these Proceedings; hence,
I need to say very little about the box labelled “Character Recognizer,”
except to note that I have introduced a somewhat arbitrary subdivision of its
internal operations because I wished to have two different kinds of outputs
available for the later discussion of the center line of the figure, involving
sound generators. In a logical sense at least character recognition proceeds
by the successive steps of isolating certain properties (features) of the letter
shapes and of identifying the particular letter by looking for combinations
of these properties. The two steps may not always be distinct, as when
character recognition is carried out by direct optical superposition. For our
present purpose, however, we are concerned only with the final output of the
recognizer (i.e., the input to the speech generator), which is a string of
discrete electrical signals that represent letter identities.

[Figure 1. Audible Outputs for Reading Machines. Block diagram; labels
recovered: printed page; transducer; direct translator (letter shape → sound
shape), acoustic code (e.g., Optophone); letter properties or features; sound
generator (feature → sound), with outputs acoustic code, speechlike sounds,
spelled speech; character recognizer, made up of property filter (letter
shape → properties or features) and decision logic (letter properties →
letter identities); letter identities; speech generator (rules / stored
units), with output synthetic or compiled speech; output sounds.]

The  most  important  of  all  the  things  we  have  been  told  about  character 
recognizers  is,  I  think,  Rabinow’s  statement  that  we  could  have  by  today’s 
techniques  an  optical  character  recognizer  that  would  meet  the  blind  per¬ 
son’s  need — and  have  it  for  about  $5,000  to  $10,000.  To  be  sure,  Rabinow 
may  be  a  little  optimistic  as  to  price  and  delivery,  but  the  point  remains  that 
it  is  realistic  even  today  to  plan  for  reading  machines  that  include  character 
recognizers. 

The  kind  of  recognizer  that  Rabinow  referred  to  was  not  the  standard 
high  speed  type  designed  for  business  applications,  but  a  much  simpler 
machine  based  on  the  same  technology.  He  implied  that  the  time  is  ripe  for 
those  working  on  aids  for  the  blind  to  proceed  forthwith  to  the  development 
of  simplified  optical  character  recognizers.  It  is  a  tempting  prospect,  espe¬ 
cially  for  some  of  us  who  are  gadgeteers  at  heart,  but  I  wonder  whether  this 
is  the  best  place  to  put  our  effort?  I  think  not,  and  for  several  reasons.  First, 
the  technology  is  still  being  developed  very  rapidly  for  business  purposes 
and with business funds. Second, we are not ready to use a character
recognizer and it will be several years before we have workable answers to those
parts  of  the  problem  that  are  unique  to  reading  machines.  Finally,  there  is  at 
least  a  reasonable  question  about  the  economics  of  simple,  cheap  recog¬ 
nizers  for  reading  machines  versus  large,  expensive  ones  that  can  work 
very  much  faster.  We  often  assume,  as  both  Rabinow  and  Mauch  seem  to 
have  done,  that  a  library  type  of  reading  machine  (including  its  optical 
character  recognizer)  will  serve  the  blind  reader  directly,  i.e.,  that  the  ma¬ 
chine  will  be  obliged  to  work  in  real  time  and  so  could  not  use  (and  need  not 
have)  the  high  speed  capabilities  of  business  machines.  In  fact,  it  may  be 
cheaper  to  use  the  fast  and  expensive  machine  to  prepare  tape  recordings 
that  will  be  used  off-line  by  the  blind  reader.  This  would  of  course  obviate 
the  need  for  any  special  development  of  the  type  that  Rabinow  suggested. 
It  might  be  possible  nevertheless  to  find  a  simple  recognition  scheme  that 
would  be  inherently  suited  to  real-time  operation  and  that  would  deserve 
development  primarily  for  application  in  reading  machines.  I  take  it  that 
this  is  what  Mauch  has  in  mind. 

I  have  not  counseled  against  devoting  effort  to  optical  character 
recognizers  because  they  seem  uninteresting  or  unnecessary — far  from  it — 
but  because  other  problems  are  at  least  as  important  and  are  not  likely 
to  be  solved  with  other  people’s  money. 

Speech  generation  from  input  information  about  the  letters  on  the 
printed  page  is  such  a  problem;  indeed  it  may  well  be  the  key  problem  in 
research  on  reading  machines.  There  are  a  variety  of  ways  in  which  speech 
might  be  produced,  as  is  suggested  by  the  diagram  (Figure  1)  inside  the 
box  labelled  “Speech  Generator.”  The  details  of  this  block  are  shown  in 
Figure 2, though even here the entries are merely illustrative, not exhaustive.
You  will  notice  that  there  are  two  essentially  different  kinds  of  output 
signal:  either  synthetic  speech  or  speech  compiled  from  recordings  of  bits 
and  pieces  of  actual  utterances.  Both  types  of  output  were  discussed  and 
demonstrated  in  Studdert-Kennedy’s  paper.  His  synthetic  speech  came 
from  a  system  represented  by  the  two  boxes  at  the  right  of  the  upper  line 
of Figure 2, i.e., those labelled “Rules: phonemes → control signals” and
“synthesizer.”  The  compiled  speech  was  from  a  dictionary  of  recorded 
words  and  so  corresponds  to  the  fourth  (bottom)  level  of  the  diagram. 

It is rather interesting to compare the trade-off between instrumental
complexity and design sophistication that is implied by each of the different
paths through this diagram; some of the paths (shown dotted)
represent methods that are possible but not, perhaps, very useful. In general,
the top line implies the least machinery but the most research; the
bottom line, the reverse. The middle levels substitute increasingly extensive
dictionaries for the logical operations implied by “Rules.”

[Figure 2. Speech Generators. Block diagram; column headings recovered:
literal text, linguistic units, machine code, output. Inputs: (1) letter code
(Teletype, etc.) or output from character recognition equipment; (2) output
from phoneme recognizer, etc.; (3) output from bandwidth compression
equipment, etc.]

The  differences  among  paths  are  not  only  those  of  design  sophistication 
versus  hardware,  however,  for  there  are  inherent  differences  also  in  the 
acceptability  of  synthetic  and  compiled  speech.  There  will  not,  I  think,  be 
any  question  of  intelligibility  even  at  rather  high  reading  rates,  but  rather 
one  of  naturalness  and  listening  effort.  Synthetic  speech  can  almost  cer¬ 
tainly  flow  more  smoothly  than  compiled  speech  and  if  it  is  synthesized 
by  rules  (top  level)  there  will  be  no  need  to  interrupt  the  continuous  flow 
of  speech  in  order  to  spell  out  words  that  do  not  appear  in  the  dictionary 
unit.  But  there  are  disadvantages  too:  speech  synthesized  by  rules  will 
have  some  odd  pronunciations  reflecting  the  vagaries  of  English  spelling 
and  the  minor  inadequacies  of  converting  linguistic  units  into  control  signals 
for  a  synthesizer;  it  will  be,  in  short,  machine  speech.  Compiled  speech 
will  sound  quite  human  and  each  word  will  be  pronounced  pleasantly  and 
accurately,  though  the  pronunciation  may  not  quite  fit  the  context  and  there 
may  be  some  rather  odd  intonations.  The  major  difficulty,  however,  will 
be  the  frequent  interruptions  for  spelling.  This  is  a  point  to  which  I  shall 
return  in  a  moment.  The  intermediate  procedures,  involving  dictionary 
searches for word-to-phoneme or word-to-control-signal entries, may in fact
be  the  lines  along  which  an  optimum  solution  is  to  be  found.  It  has  been 
demonstrated,*  for  example,  that  stored  control  signals  can  produce  syn¬ 
thetic  speech  so  similar  to  natural  speech  as  to  be  almost  indistinguishable 
from  it.  Thus,  by  a  suitable  choice  of  intermediate  procedures  we  may 
attain  the  advantages  of  synthetic  speech  with  the  naturalness  of  compiled 
speech,  though  at  a  price  that  combines  the  most  expensive  features  of 
both  procedures,  i.e.,  much  research  and  a  large  memory. 

How  large  must  these  memories  be  and  how  serious  is  the  spelling 
problem?  Clearly  these  are  related  questions,  since  spelling  should  become 
more  and  more  infrequent  as  the  dictionary  becomes  larger  and  larger. 
Some  rough  estimates  may  interest  you.  Assuming  ordinary  magazine  text 
and  a  recorded  vocabulary  of  about  6000  words,  one  will  expect  to  spell 
about  1  word  in  20  (5  percent);  for  a  vocabulary  of  15,000  to  20,000 
words  the  spelling  rate  should  be  down  to  about  1  percent;  the  vocabulary 
required  to  duplicate  a  desk  sized  collegiate  dictionary  is  about  60,000 
words,  or  to  match  Webster’s  Unabridged  about  600,000  words.  The  diffi¬ 
culty,  of  course,  is  that  word  frequencies  lie  on  a  very  long  tailed  distribu¬ 
tion.  The  practical  problem  has  the  further  complication  that  one  does  not 
know  how  much  spelling  the  user  will  tolerate  and  there  is  no  way  to  find 
out  without  running  extensive  tests  with  substantial  amounts  of  compiled 
speech  and  with  sizable  groups  of  blind  users. 
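
The relation between vocabulary size and spelling rate can be estimated from any word-frequency count. The sketch below is only an illustration: the function name `spelling_rate` and the tiny sample text are assumptions, and it does not reproduce the estimates quoted above. It ranks words by frequency, treats the top `vocab_size` words as the recorded vocabulary, and reports the fraction of running words that would have to be spelled.

```python
from collections import Counter

def spelling_rate(tokens, vocab_size):
    """Fraction of running words not covered by the vocab_size most frequent words."""
    counts = Counter(tokens)
    covered = sum(n for _, n in counts.most_common(vocab_size))
    return 1.0 - covered / len(tokens)

# Hypothetical sample text; a real estimate needs a large body of magazine text.
sample = "the cat saw the dog and the dog saw the bird".split()
for size in (1, 3, 6):
    print(size, round(spelling_rate(sample, size), 3))
```

Because a few words account for most of the running text, the rate falls quickly at first and then very slowly, which is the long-tailed behavior described above.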

There  are  some  things  one  might  do  to  alleviate  the  situation,  though 
they  complicate  the  machinery.  For  example,  a  great  many  words  occur  in 
pairs that differ only in an added s or es to form the plural. It seems a waste
of  memory  space  to  store  both  forms,  so  one  might  devise  some  way  to  look 
for  the  plural  suffix,  then  use  the  recording  for  the  singular  form  and  add 
a  brief  buzzy  sound.  Another  useful  procedure  may  be  to  use  specialized 
vocabularies  to  supplement  the  general  vocabulary.  Thus,  in  reading  an 
article  on  music,  for  example,  the  specialized  vocabulary  would  contain  the 
names  of  musical  instruments  and  other  terms  which  would  almost  certainly 
not  occur  in  the  first  10  or  20  thousand  words  of  a  general  purpose  vocabu¬ 
lary. 
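
The two economies just described, plural stripping and supplementary vocabularies, can be sketched as a lookup with fallbacks. Everything named here is hypothetical: `compile_word`, the `"<buzz>"` marker standing for the added plural sound, and the small word sets. The sketch handles only a plain added s or es, not irregular plurals.

```python
def compile_word(word, recorded):
    """Return the playback units for one printed word.

    recorded: set of words for which a recording is assumed to exist.
    """
    w = word.lower()
    if w in recorded:
        return [w]
    # Plural check: try the longer suffix first, then the shorter one.
    for suffix in ("es", "s"):
        stem = w[: -len(suffix)]
        if w.endswith(suffix) and stem in recorded:
            return [stem, "<buzz>"]
    # Last resort: spell the word letter by letter.
    return list(w)

general = {"violin", "box", "play"}
music = {"oboe", "sonata"}  # hypothetical specialized vocabulary
recorded = general | music

for word in ("violins", "boxes", "oboes", "fugue"):
    print(word, compile_word(word, recorded))
```

Only the last word falls through to spelling; adding it to the specialized set would remove that interruption at the cost of one more recording.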

Let us suppose that the use of these special methods will make it
possible to hold the spelling rate to acceptable levels with a general vocabulary
of only 10 to 20 thousand words. How much machinery does this
imply? Probably we should consider digital storage not because it is necessarily
best suited to speech, but because all the large random access memories
developed for business purposes are digital devices. We can use the
spelling of the word as its address and for practical purposes ignore the
storage space that this requires. If we use pulse code modulation to record
the speech (the obvious, direct way) we shall require several hundred
million bits of storage for a 10,000-word vocabulary. This implies a rather
big device; some of the newer large scale disc memories would just suffice.

* Such a demonstration was given by Dr. Gunnar Fant at the Sixty-first Meeting
of the Acoustical Society of America, Philadelphia, May 1961.

The  speech  information  could  be  coded  more  compactly,  perhaps  by 
a  factor  of  30  to  50,  if  the  storage  were  in  the  form  of  control  signals  to 
operate  a  synthesizer  (next  line  to  the  bottom  in  Figure  2).  Somewhat  less 
storage  would  be  needed  and  we  might  be  able  to  generate  really  excellent 
speech  if  the  control  signals  were  to  operate  a  synthesizer  that  is  an  analog 
of  the  human  vocal  tract.  Such  a  synthesizer  would  have  the  “physiological” 
constraints  built  into  it  so  that  fewer  external  controls  would  be  needed. 
The  difficulty  is  that  nobody  yet  knows  quite  how  to  generate  these  con¬ 
trol  signals,  though  this  problem  is  being  worked  on  at  both  the  Massachu¬ 
setts  Institute  of  Technology  and  the  Bell  Telephone  Laboratories. 
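
The storage figures for pulse code modulation and for control signals can be checked with rough arithmetic. The sampling rate, bits per sample, and mean word duration below are assumptions chosen for illustration, since the text does not state them. With these values a 10,000-word vocabulary comes to 280 million bits, which is in line with "several hundred million."

```python
# All three parameters are assumptions chosen for illustration.
SAMPLE_RATE_HZ = 8_000    # telephone-quality sampling
BITS_PER_SAMPLE = 7       # coarse PCM quantization
MEAN_WORD_SECONDS = 0.5   # average duration of a recorded word

def pcm_bits(vocab_words):
    """Bits needed to store vocab_words recordings as plain PCM."""
    return int(vocab_words * MEAN_WORD_SECONDS * SAMPLE_RATE_HZ * BITS_PER_SAMPLE)

pcm = pcm_bits(10_000)
print(f"PCM: {pcm:,} bits")
for factor in (30, 50):
    # Control-signal storage, using the 30-to-50-fold saving quoted in the text.
    print(f"control signals, 1/{factor}: {pcm // factor:,} bits")
```

Under the same assumptions, storage as control signals would need roughly 6 to 9 million bits.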

It  might  even  be  possible,  in  line  with  Studdert-Kennedy’s  comments 
about  the  articulatory  basis  for  speech  perception,  eventually  to  obtain 
still  simpler  control  signals  in  terms  of  “motor  commands”  that  drive  the 
“musculature”  of  a  vocal  tract  analog.  But  this  is  10  to  20  years  away, 
since  we  have  neither  the  knowledge  of  how  to  generate  the  control  signals 
nor  of  how  to  build  the  synthesizer. 

Perhaps  we  should  re-examine  the  possibility  of  storing  the  speech  or 
control  signals  in  analog  form  rather  than  digital.  If  storage  area  (magnetic 
or  optical)  is  taken  as  a  criterion,  then  indeed  analog  storage  has  definite 
advantages.  For  example,  about  the  same  area  of  magnetic  recording  sur¬ 
face  would  be  needed  to  store  a  spoken  vocabulary  by  ordinary  (direct) 
magnetic  recording  as  would  be  needed  for  the  same  vocabulary  in  the 
highly  processed  form  of  digital  control  signals  such  as  we  have  been 
discussing.  It  is  tempting  to  conclude  that  this  would  justify  a  program 
for  the  development  of  a  random  access  memory  especially  suited  to  speech 
storage — a  challenging  problem  but  not,  I  think,  our  proper  business  at 
this  time,  since  we  can  get  by  with  the  digital  equipment  that  is  already 
in  being. 


INTERMEDIATE  SOLUTIONS  TO 
THE  READING  MACHINE  PROBLEM 

We  have  now  considered  the  two  extremes;  let  us  turn  to  possible  devices 
that are neither so limited in performance as the direct translators nor so
complex  and  expensive  as  the  generators  of  synthetic  or  compiled  speech. 
We  shall  find  an  odd  mixture  of  quick  compromises,  long  term  possibilities, 
and  beckoning  mirages.  The  intermediate  types  of  device  indicated  on  the 
center line of Figure 1 have the distinguishing properties (1) that the
output  is  not  quite  speech  and  (2)  that  the  input,  although  it  may  require 
something  less  than  full  recognition  of  the  printed  letters,  is  nevertheless 
categorical  and  so  implies  that  some  kind  of  decision  or  recognition  process 
has  been  applied  to  the  noncategorical  information  supplied  by  the  scanner. 
The  hope,  of  course,  is  that  comparatively  simple  operations  can  be  made 
to  yield  an  adequately  useful  output. 

Mauch  has  described  briefly  a  device  that  used  letter  features  to  gen¬ 
erate  sound  units  of  word  length  rather  than  letter  length.  Clearly  this  was 
a  step  in  the  right  direction,  since  one  of  the  difficulties  with  the  acoustic 
code  from  direct  translators  is  that  the  signals  do  not  blend  and  flow  as 
they  do  in  the  words  of  speech.  Might  it  not  be  possible  to  merge  the 
information  about  successive  letters  to  obtain  syllable-like  or  word-like 
acoustic  outputs  and  do  it  without  losing  the  distinctions  between  different 
printed  words  or  incurring  the  complexities  of  generating  natural  speech? 
Some  of  the  features  of  letter  shapes  (projection  above  or  below  the  line, 
curvature,  diagonal  stroke,  and  the  like)  should  be  comparatively  easy  to 
identify  by  machine  and  so  could  provide  a  distinctive  code  for  the  word. 
It would then remain only to generate distinctive “word sounds.” Perhaps
arbitrary  acoustic  codes  of  this  kind  can  be  found  that  will  be  easily  learned 
and  will  also  be  identifiable  at  rapid  reading  rates,  as  Clowes  and  others 
have  suggested.  There  has  been  very  little  work  on  this  possibility  and  one 
can  only  guess  at  the  answer. 
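
The risk of "losing the distinctions between different printed words" can be made concrete with a toy feature code. The sketch below is an assumption throughout: it uses only one of the features mentioned (projection above or below the line), the letter classes are hypothetical, and the names `word_code` and `collisions` are invented for illustration.

```python
# Hypothetical feature classes; a real device would derive these from the scanner.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def letter_feature(ch):
    """Classify a lowercase letter by its projection above or below the line."""
    if ch in ASCENDERS:
        return "A"
    if ch in DESCENDERS:
        return "D"
    return "x"

def word_code(word):
    """Feature code for a whole word, one symbol per letter."""
    return "".join(letter_feature(ch) for ch in word.lower())

def collisions(words):
    """Group words that share a feature code and so would sound alike."""
    groups = {}
    for w in words:
        groups.setdefault(word_code(w), []).append(w)
    return {code: ws for code, ws in groups.items() if len(ws) > 1}

print(word_code("dog"))
print(collisions(["dog", "bag", "cat", "pit"]))
```

With so coarse a code, "dog" and "bag" receive the same pattern. Further features such as curvature or diagonal strokes would separate more words, at the cost of more recognition machinery.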

Another  possibility,  much  more  attractive  to  those  of  us  who  have 
worked  on  speech  perception,  is  to  start  with  the  same  information  about 
letter  features,  but  to  generate  acoustic  codes  that  will  be  speech-like  in  the 
sense  that  they  are  readily  pronounceable,  i.e.,  they  could  be  mimicked. 
There  is  good  theory  and  good  evidence  for  supposing  that  such  a  speech¬ 
like  code  would  prove  to  be  quite  efficient  and  would  not  be  very  difficult 
to  learn.  Indeed  it  should  be  less  difficult  than  to  master  a  foreign  language, 
since  the  code  would  at  least  retain  familiar  meanings  and  grammar.  The 
only  difficulty  with  this  otherwise  attractive  possibility  is  that  no  one  has 
yet  seen  how  to  instrument  it  by  means  that  are  markedly  simpler  (either  in 
extracting  the  necessary  letter  properties  or  in  generating  the  speech-like 
sounds)  than  would  be  required  to  go  all  the  way  to  letter  identification 
and the generation of natural speech.

Still  another  intermediate  possibility  was  described  for  us  by  Metfessel. 
He  has  used  a  very  rapid  form  of  spelling  as  the  acoustic  output.  This 
should be easier to implement than any of the speech generators we have
discussed thus far and, in fact, Mauch has developed a mechanical unit
that  is  ideally  suited  to  the  purpose.  However,  a  “spelled  speech”  device 
would  have  two  severe  disadvantages:  it  would  require  the  full  services  of 
an  optical  character  recognizer  to  provide  its  input,  and  its  output  though 
simply  instrumented  would  be  only  marginally  acceptable.  For  these  rea¬ 
sons  the  method  seems  likely  to  find  application  only  as  an  interim  stage 
in  the  development  of  speech  generating  systems  and  also,  perhaps,  as  an 
inexpensive  way  to  generate  an  audible  output  from  punched  paper  tapes 
that  are  already  available  as  a  by-product  from  the  printing  industry. 

It  might  seem  from  the  discussion  thus  far  that  all  the  possibilities 
shown on the center line of Figure 1 are for one reason or another not
very  promising.  This  is  not  quite  correct,  because  it  is  only  here  that  we 
can  hope  to  find  the  ultimate  device  which  will  give  good  performance  and 
yet  remain  within  the  size  and  cost  limits  for  a  personal  reading  machine. 
It  does  not  follow,  however,  that  this  is  the  area  in  which  current  research 
should  be  centered.  It  is  quite  likely,  indeed,  that  the  design  sophistication 
needed  to  develop  a  good  intermediate  device  will  be  far  beyond  that  re¬ 
quired  for  a  library  type  of  machine  that  talks  plain  English.  Consider,  as 
a  parallel,  today's  portable  transistor  radio.  It  seems  much  simpler  than 
the  superheterodyne  receiver  of  the  nineteen  twenties  with  its  many  vacuum 
tubes  and  separate  tuning  controls.  The  apparent  simplicity  is  deceptive, 
for  it  really  reflects  far  more  sophistication  in  components  and  design  than 
anything  imagined  in  the  'twenties.  Similarly,  an  intermediate  device  may 
someday  be  built  using  cheap  and  apparently  simple  hardware  to  make  a 
feature  analysis  and  produce  speech-like  sounds  all  in  a  very  compact  little 
machine.  But  we  will  not  know  how  to  design  this  kind  of  machine  for 
some  years  yet. 

IN  CONCLUSION 

Perhaps  I  can  summarize  these  remarks  and  some  of  the  points  made  in 
the  earlier  papers  under  a  pair  of  topics:  the  requirements  that  a  reading 
machine  should  meet  and  practical  objectives  for  current  research. 

As  to  requirements  we  can  say  flatly  that  a  reading  machine  must  read 
ordinary  type.  Ideally  it  should  also  be  cheap  and  portable,  permit  fast 
reading,  and  talk  plain  English  or  something  that  is  very  easy  to  learn. 

In  a  practical  sense  not  all  these  objectives  can  be  met  and  we  must  be
prepared  to  accept  compromise  solutions  for  quite  some  time.  One  of 
these  compromise  solutions  will  be  a  personal  type  of  device  (such  as  the 
Battelle  aural  reading  device)  that  leaves  much  to  be  desired  in  per¬ 
formance,  but  is  nevertheless  usable  where  reading  speed  is  not  a  major 
consideration.  Another  will  be  that  of  a  reading  machine  for  use  in  libraries 
or  centers  where  tapes  are  recorded  for  distribution.  This  will  inevitably 
be  a  complex  and  expensive  machine,  but  it  should  produce  natural  speech 
of  quite  acceptable  quality.  In  addition,  there  may  be  intermediate  devices 
of  one  kind  or  another  that  will  fill  special  needs,  though  they  are  not 
likely  to  be  both  simple  enough  for  personal  use  and  effective  enough  for 
general  use;  however,  the  reading  machine  of  the  future  may  well  be  an 
intermediate  type  of  device,  but  one  that  attains  its  simplicity  of  hardware 
by  a  combination  of  sophisticated  design  and  the  use  of  a  speech-like  output. 

Clearly  there  is  much  research  to  be  done — so  much  that  it  behooves 
us  to  sift  out  the  real  problems  from  the  pseudoproblems  and  even  then  to 
give  careful  attention  to  priorities. 

Among  the  real  problems  is  the  practical  one  of  getting  a  good  direct 
translator  into  use.  This  means  choosing  one  specific  device,  building  a 
number  of  models  for  field  trials,  and  working  out  training  methods.  The 
Battelle  aural  reading  device  is  the  logical  choice,  for  it  is  good  enough 
in  an  absolute  sense  and  far  closer  to  practical  realization  than  any  other 
competitive  device. 

Another  research  area  important  for  the  development  of  a  high  per¬ 
formance  library  type  of  device  is  the  output  problem,  i.e.,  how  to  gen¬ 
erate  acceptable  speech.  We  do  not,  I  think,  need  to  be  much  concerned 
just  now  with  the  development  of  optical  character  recognizers  and  large 
random  access  memories,  though  they  are  essential  components.  It  seems 
fair  to  assume  that  by  the  time  the  output  problem  is  solved — that  is,  in 
another  two  or  three  years — such  devices  will  be  available,  having  been 
developed  primarily  for  business  applications  of  electronic  data  processing. 
The  marriage  of  these  devices  with  a  speech  generator  to  provide  a  high- 
performance,  library  type  of  reading  machine  should  be  a  fairly  simple 
ceremony. 

A  third  area  on  which  research  ought  now  to  be  proceeding  is  in  the 
development  of  tactile  devices  or  methods  for  generating  tactile  text.  This 
aspect  of  the  reading  machine  problem — inescapable  for  text  that  contains 
diagrams,  maps,  or  formulae — has  had  less  attention  here  than  it  deserves. 

Finally,  there  will  be  a  gap  in  the  development  schedule  between  the 
Battelle  type  of  device  and  the  high  performance  reading  machine  for
library  use.  Anything  that  can  be  done  to  provide  intermediate  devices  for
the  interim  period  should  be  carefully  considered. 

Are  there  areas  of  low  priority?  I  realize  that  this  is  a  delicate  question, 
but  when  resources  are  so  limited  it  is  all  the  more  important  to  avoid 
unprofitable  enterprises.  One  of  these  would  seem  to  be  the  effort  to  evolve 
a  better  output  for  direct  translation  devices.  I  would  even  label  this  a 
pseudoproblem  on  two  counts:  first,  there  is  a  great  deal  of  evidence  that 
improvements  (if  any)  will  be  marginal;  second,  if  a  device  such  as  the 
Battelle  machine  is  carried  to  the  point  that  people  can  and  do  use  it,  then 
it  makes  very  little  sense  to  duplicate  the  development  costs  just  to  produce 
a  different  device  of  about  the  same  capabilities. 

Another  problem  to  which  I  would  assign  a  rather  low  priority  is  the 
engineering  development  of  optical  character  recognizers  or  large  random 
access  memories.  These  are  not  unimportant  problems,  but  they  will  be 
solved  by  other  people  for  other  purposes  by  the  time  we  need  the  devices 
for  reading  machine  applications. 

I  should  like  to  end  these  remarks  with  a  brief  reminiscence  and  a 
gratuitous  prediction.  In  the  mid-nineteen  forties  the  Committee  on  Sensory 
Devices  sponsored  some  research  on  reading  machines  for  the  blind.  It 
did  a  careful  survey  of  the  prior  art  and  listened  attentively  to  suggestions 
for  new  developments.  Near  the  end  of  its  active  life  the  Committee  con¬ 
sidered  a  project  for  the  construction  of  a  device  to  recognize  typewritten 
characters  and  generate  the  corresponding  letter  sounds.  This  idea,  even 
though  proposed  by  a  world  famous  scientist  high  in  one  of  our  largest 
electronics  companies,  seemed  almost  too  visionary  to  warrant  exploration. 
Today  the  idea  seems  amusingly  archaic. 

As  a  prediction,  let  me  say  that  we  will  have,  within  five  years  or  so, 
at  least  one  major  library  center  in  which  a  reading  machine  is  being  used 
to  prepare  tape  recordings  for  blind  users;  further,  that  the  machine  will 
be  largely  automatic,  will  provide  quite  acceptable  speech  at  normal-to- 
rapid  reading  rates,  and  will  be  so  fast  that  the  cost  per  tape  will  be  quite 
modest.  I  hope  none  of  you  will  accuse  me  of  merely  betting  on  a  sure 
thing,  even  though  I  probably  am. 


SECTION  III 


INDIRECT  ACCESS 
TO  THE  PRINTED  PAGE 

CHAIRMAN:  EDWARD  L.  GLASER 
Burroughs  Corporation,  Paoli,  Pennsylvania 


AUTOMATIC  MACHINE  TRANSLATION: 
POTENTIALITIES  FOR  BRAILLE  ENCODING 

VICTOR  H.  YNGVE 

Massachusetts  Institute  of  Technology,  Cambridge,  Massachusetts


The  automatic  transcription  of  contracted  braille  from  uncontracted  ma¬ 
terial  shares  many  of  the  problems  of  the  translation  by  machine  of  such 
languages  as  German  and  Russian.  To  the  extent  that  the  braille  spelling 
rules  refer  to  the  conventional  spelling  of  the  original  the  problems  are 
minor.  But  to  the  extent  that  the  braille  spelling  rules  refer  to  pronuncia¬ 
tion,  grammatical  function,  or  meaning  the  problems  are  severe  and  can 
be  attacked  only  with  very  sophisticated  methods.  In  other  words,  going 
from  a  code  (ordinary  inkprint  spelling)  into  braille  is  really  a  problem  in 
translation. 

There  are  available  source  documents  in  the  form  of  punched  paper 
tape  that  are  by-products  of  the  printing  industry.  It  has  occurred  to  many 
of  us  that  these  could  perhaps  be  used  to  produce  braille  copies  auto¬ 
matically.  There  is  also  the  possibility  of  providing,  from  a  typewriter 
keyboard,  pulses  corresponding  to  ordinary  spelling,  and  then  translating 
these  into  the  correct  contractions  in  braille.  There  are  various  other  pos¬ 
sible  ways  of  tying  the  two  systems  of  representation  together. 

The  problem  is  made  difficult  by  the  nature  of  braille.  Braille  is  in  a
sense  based  on  spelling,  but  many  of  its  rules  refer  not  to  the  spelled 
form  but  to  the  underlying  language.  In  other  words,  in  a  very  real  sense 
braille  is  a  direct  representation  of  the  spoken  language  rather  than  a  direct 
representation  of  the  spelled  form  that  we  find  in  books. 

In  order  to  make  the  rules  of  braille  easier  many  of  the  rules  of  ordi¬ 
nary  spelling  have  been  adopted,  so  that  many  words  in  braille  are  spelled 
exactly  the  same  way  as  they  are  in  a  book.  Since  the  purpose  of  braille 
is  still  to  transmit  information,  however,  I  think  it  proper  that  the  rules 
have  been  stated  in  terms  of  the  underlying  language.  This  is  true  as  long 
as  a  human  being  transcribes  the  braille,  because  he  can  understand  the 
language  that  is  being  encoded  into  braille  and  can  very  easily  use  rules 
couched  in  terms  of  the  underlying  language.  However,  given  the  problem 
of  translating  into  braille  from  a  representation  equivalent  to  inkprint,  or 
from  the  output  of  a  typewriter  keyboard,  one  faces  a  different  problem. 
First  of  all,  the  spelling  system  of  English  is  notoriously  poor. 

I  think  I  can  illustrate  the  sort  of  problems  one  faces  by  stating  a 
couple  of  rules  from  the  braille  standard  of  some  years  ago.  Rule  34  for 
Grade  2  braille  has  to  do  with  contractions;  it  says,  “Contractions  forming 
parts  of  words  should  not  be  used  when  they  are  likely  to  lead  to  obscurity 
in  recognition  or  pronunciation,  and  therefore  they  should  not  overlap  well- 
defined  syllable  divisions.”  This  rule  is  stated  in  terms  of  syllable  divisions, 
something  that  is  not  explicitly  represented  in  inkprint.  “Word  signs  should 
be  used  sparingly  in  the  middle  of  words  unless  they  form  distinct  syllables. 
.  .  .  Special  care  should  be  taken  to  avoid  undue  contractions  of  words  of 
relatively  infrequent  occurrence.”  It  goes  on,  “.  .  .  when  words  occur  at 
the  end  of  a  line,  they  must  be  at  the  end  of  a  syllable.”  Here  we  have  a 
rule  for  contracted  braille  stated  not  in  terms  of  the  inkprint  spelling,  but 
in  terms  of  syllables,  which  are  a  feature  of  the  underlying  language.  The 
question  arises:  Can  one  syllabify  a  word  automatically?  I  think  most  of 
you  know  that  this  is  very  difficult.  I  have  to  look  words  up  in  the  dictionary 
in  order  to  separate  them  correctly  at  the  end  of  a  line.  The  difficulty  is 
partly  due  to  the  traditional  spelling  of  English,  which  is  a  heritage  from  past
eras  and  in  many  cases  does  not  conform  exactly  to  the  pronunciation.

The  next  example  is  contained  in  Rule  23.  According  to  this  rule  the 
contractions  “to,”  “into,”  and  “by”  are  always  to  be  written  close  up  to 
the  word  which  follows.  It  goes  on,  “.  .  .  in  such  phrases  as
‘it  was  referred  to  yesterday,’  and  ‘he  was  passed  by  when  others  were 
noticed,’  the  ‘to’  and  the  ‘by’  should  be  written  in  full  and  not  contracted, 
as  they  refer  to  the  preceding  verb  and  not  to  the  word  that  follows  them.”
In  other  words,  if  “to”  or  “by”  are  prepositions,  as  in  “to  the  house,”  or 
“by  the  table,”  then  one  would  contract  the  “to”  or  the  “by”  according  to 
this  rule,  and  write  the  contracted  form  without  a  space  immediately  pre¬ 
ceding  the  next  word.  However,  if  “to”  and  “by”  are  adverbs,  as  in  the 
case  “.  .  .  it  was  referred  to  yesterday  .  .  .”  and  “  ...  he  was  passed  by 
when  others  were  noticed,”  they  are  not  contracted.  There  is  no  clue  in  the 
inkprint  that  these  words  are  in  one  case  prepositions  and  in  the  other  case 
adverbs;  they  are  not  marked  explicitly.  One  needs  to  have  an  understand¬ 
ing  of  the  sentence  in  order  to  make  that  distinction,  or  else  one  has  to 
have  a  method  of  grammatically  parsing  the  sentence  so  that  he  can  de¬ 
termine  whether  these  words  are  prepositions  or  adverbs.  This  is  a 
problem  that  has  been  faced  in  the  mechanical  translation  of  languages, 
and  I  shall  say  a  little  bit  about  it  below. 

The  following  sentence  is  actually  syntactically  ambiguous:  “It  was 
referred  to  the  other  day”  (or,  “It  was  referred,  to  the  other  day”).  The 
inkprint  makes  no  distinction;  one  can  say  it  either  way,  however,
using  a  different  tone  of  voice.  The  “to”  in  this  sentence  (as  I  read  the  rules 
of  braille)  would  in  one  case  be  contracted  and  written  next  to  the  word; 
in  the  other  case  it  would  not  be  contracted.  The  resolution  of  such 
ambiguities  is  relatively  easy  for  the  person  who  reads  the  material,  if  he 
understands  it.  The  resolution  of  such  ambiguities  would  be  very  difficult, 
however,  for  a  machine.  It  is  also  the  sort  of  ambiguity  resolution  that  the 
people  working  in  mechanical  translation  of  languages  have  been  facing. 

I  shall  give  a  brief  summary  of  this  work.  The  first  hope  was  that 
one  could  put  a  dictionary  into  a  computer.  The  computer  would  simply 
look  up  the  words  one  at  a  time  in  the  dictionary,  finding  equivalent  words 
in  the  other  language,  and  print  them  out.  Such  a  dictionary  would  be 
easy  to  mechanize;  the  problems  were  involved  primarily  in  the  large 
size  of  the  dictionary  as  compared  with  the  relatively  small  size  of  mem¬ 
ories.  The  most  promising  method  of  implementing  such  a  thing  would 
be  to  put  the  dictionary  on  magnetic  tape  available  to  the  computer  and 
arrange  to  look  up  the  words  in  batches.  I  say  this  was  a  hope;  there  were 
many  problems  with  it.  First  of  all,  especially  in  languages  such  as  Russian,
which  was  given  a  lot  of  attention  by  people  working  in  mechanical  trans¬ 
lation,  it  was  realized  very  quickly  that  the  size  of  the  dictionary  could  be 
greatly  reduced  by  not  storing  a  whole  word  complete  with  its  ending,  but 
storing  instead  its  stem  separate  from  the  ending.  Then  a  program  in  the 
machine  would  take  each  Russian  word,  examine  it  letter  by  letter,  split  off 
any  inflectional  endings  there  might  have  been  (e.g.,  case  endings,  verb
endings),  and  look  up  the  remainder  in  the  dictionary.  Then,  having  found 
the  stem  of  the  word  in  the  dictionary,  the  machine  could  go  ahead  and 
interpret  the  remainder  of  the  word  as  an  inflectional  ending  and  give  it 
its  appropriate  meaning.  Programs  of  this  type  have  been  written  at  a 
number  of  universities  and  at  a  number  of  industrial  firms  that  have  been 
working  on  this  type  of  translation.  I  can  report  that  the  problem  is  effec¬ 
tively  solved. 
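
The stem-and-ending look-up just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny tables of transliterated Russian stems and endings, and the function name look_up, are hypothetical and are not taken from any of the programs mentioned.

```python
# Toy tables. The transliterated Russian entries are illustrative
# assumptions, not data from the programs described in the text.
STEMS = {"knig": "book", "gazet": "newspaper"}
ENDINGS = {"a": "nominative singular", "u": "accusative singular"}


def look_up(word):
    """Return (gloss, meaning of ending), or None if no stem matches."""
    # Examine the word letter by letter, trying the longest stem first.
    for cut in range(len(word), 0, -1):
        stem, ending = word[:cut], word[cut:]
        if stem in STEMS and (ending == "" or ending in ENDINGS):
            return STEMS[stem], ENDINGS.get(ending, "no ending")
    return None


print(look_up("knigu"))   # stem "knig" plus ending "u"
print(look_up("gazeta"))  # stem "gazet" plus ending "a"
```

Storing two stems and two endings covers four word forms here, which is the saving in dictionary size that the text refers to.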

However,  our  hopes  were  really  too  high.  The  result  of  writing  out 
just  the  words  from  such  a  dictionary  look-up  process  was  completely 
inadequate  as  a  translation  (and  I  mean  completely  inadequate).  There
were  two  main  reasons  for  this.  One  was  that  if  one  looks  up  almost  any
word  in  a  dictionary,  one  finds  that  it  has  several  renditions  in  the  other
language.  The  other  reason  was  that  even  if  one  could  select  the  correct
meaning  of  each  of  the  input  words  and  string  these  meanings  together,
the  word  order  would  be  wrong  and  in  general  a  grammatically  correct
sentence  would  not  be  obtained.  In  some  cases  the  “translation”  is  so  badly  garbled
that  one  cannot  make  any  sense  out  of  it  even  if  the  correct  word  is  there. 
The  problem  was,  what  next  to  do? 

The  next  step  was  to  look  at  the  output  and  see  whether  something 
more  could  be  done.  Certain  rules  were  set  up,  ad  hoc  rules,  which  worked 
perhaps  80  percent  of  the  time.  Let  me  illustrate  such  a  rule.  The  letter 
sequence  d-e-r  in  German  can  be  an  article  in  front  of  a  noun;  it  can  be 
a  relative  pronoun;  if  it  is  an  article  it  can  be  nominative,  genitive,  or 
dative.  The  translation  of  this  three  letter  word  would  depend  on  its  gram¬ 
matical  function.  Thus,  a  very  simple  rule  of  thumb  is:  “If  d-e-r  follows 
a  noun  without  a  comma  translate  it  ‘of  the.’  ”  This  rule  will  give  the 
correct  answer  about  90  percent  of  the  time;  perhaps  even  95  percent  of 
the  time.  It  is  wrong  when  “der”  is  dative  (and  it  could  very  easily  be 
dative),  but  it  is  dative  perhaps  only  5  percent  of  the  time.  Thus  the  trans¬ 
lation  is  wrong  about  5  percent  of  the  time.  It  sounds  impressive,  however, 
to  have  a  rule  that  works  95  percent  of  the  time.  This  is  the  type  of  rule 
that  I  call  an  “ad  hoc  rule.”  The  rule  is  not  really  based  on  the  structure 
of  the  language.  In  other  words  the  case  is  not  determined  and  the  part 
of  speech  is  not  determined. 
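
A rule of thumb of this kind might be sketched as follows in Python. The sketch rests on loudly stated assumptions: the input is a list of word tokens, a capitalized preceding token is taken as a crude stand-in for "is a noun" (German nouns are capitalized, but so is any sentence-initial word), and a trailing comma on that token means a comma intervenes. The function name translate_der is hypothetical.

```python
def translate_der(tokens, i):
    """Apply the rule of thumb to tokens[i], which is assumed to be 'der'.

    Returns "of the" when the rule fires, or None when it does not apply.
    """
    prev = tokens[i - 1] if i > 0 else ""
    # Assumption: capitalization approximates "the previous word is a noun".
    follows_noun = prev[:1].isupper() and not prev.endswith(",")
    return "of the" if follows_noun else None


print(translate_der("das Haus der Frau".split(), 2))
print(translate_der("der Mann, der kommt".split(), 2))
```

Note that nothing here determines case or part of speech. The rule fires on surface features alone, which is why it fails whenever "der" after a noun happens to be dative.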

Many  of  the  mechanical  translation  groups  looked  for  and  discovered 
a  large  number  of  such  rules  of  thumb;  they  were  able  to  make  a  fairly 
reasonable  improvement  in  readability.  Another  example  they  found  was 
this:  “If  there  are  three  meanings  for  a  word,  and  one  is  very  frequent 
while  the  other  two  are  not  as  frequent,  then  print  the  frequent  meaning
and  forget  about  the  others.”  Again  the  quality  is  improved  because  it  is 
very  difficult  for  the  reader  to  be  faced  with  three  alternatives;  he  can 
read  much  more  easily  if  he  has  only  single  words  to  consider.  Choosing 
the  most  frequent  meaning  is  more  often  right  than  not.  This  is  also  the 
kind  of  rule  that  is  not  really  a  “correct  rule.”  On  the  average,  however, 
it  will  work. 
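
The "print the frequent meaning" rule can be illustrated with a short Python sketch. The word chosen, its three English renditions, and especially the counts are hypothetical values made up for the example; the name most_frequent_sense is likewise an assumption.

```python
# Hypothetical frequency counts for the renditions of one German word.
SENSE_COUNTS = {"Zug": {"train": 60, "draft": 25, "procession": 15}}


def most_frequent_sense(word):
    """Return the most frequent rendition, or None for an unknown word."""
    senses = SENSE_COUNTS.get(word)
    if not senses:
        return None
    # Ignore context entirely and keep only the commonest rendition.
    return max(senses, key=senses.get)


print(most_frequent_sense("Zug"))
```

With these made-up counts the choice would be right 60 times in 100 and wrong the other 40, which is the sense in which such a rule works "on the average" without being correct.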

Mechanical  translation  people  were  quite  optimistic  about  this  procedure;
they  thought,  “.  .  .  it’s  just  a  matter  of  finding  more  and  more  of
these  rules;  fixing  up  the  order;  eliminating  more  and  more  of  the  prob¬ 
lems.”  Unfortunately,  one  can’t  go  all  the  way  with  this  approach.  It  be¬ 
comes  much  too  complicated,  rules  conflict  with  rules,  and  one  never  really 
knows  what  one  has  when  it  is  done.  I  want  to  emphasize,  however,  that 
such  rules  will  take  care  of  perhaps  80  percent  of  the  problems  involved. 
This  first  80  percent  of  the  problem  is  easy  to  solve.  It  is  the  remaining  20 
percent  that  is  extremely  difficult  and  that  cannot  be  solved  by  such  rules 
of  thumb. 

The  next  approach  was  to  try  and  do  it  right:  to  find  out  what  are  the 
actual  parts  of  speech  of  each  word  in  the  sentence.  Let  us  try  to  find  out 
whether  it  is  a  preposition  or  an  adverb.  Let  us  find  out  what  is  the  subject 
of  the  sentence;  what  is  the  verb;  what  is  the  object;  and  so  on.  In  other 
words,  do  a  complete  parsing  of  the  sentence.  Programs  of  this  sort  have 
been  written  and  they  are  fairly  successful,  but  a  new  batch  of  problems  has 
shown  up. 

In  general  it  is  not  possible  to  parse  a  sentence  without  knowing  its 
meaning.  This  we  found  through  experience.  I  suppose  if  we  had  thought 
about  it  we  would  have  known,  but  we  hoped  that  a  simple  parsing  of  the 
sentence  would  give  us  enough  improvement  in  the  output  of  the  translating 
program  to  be  useful.  Take  the  sentence  we  used  above:  “it  was  referred 
to  the  other  day,”  or,  as  it  may  be  read,  “it  was  referred,  to  the  other  day.” 
It  would  appear  that  parsing  would  help.  However,  this  sentence  is  am¬ 
biguous;  it  has  two  different  parsings.  In  any  given  text  this  sentence  would 
be  unambiguous  because  of  its  context,  because  the  person  who  reads  it 
would  understand  what  was  meant  very  readily,  and  it  would  never  enter 
his  mind  that  the  sentence  was  ambiguous.  Unless  the  machine  can  understand  the
text  too,  it  cannot  do  this.  So  the  limitation  on  automatic  parsing  of
sentences  is  just  at  that  point  where  we  need  to  understand  the  meaning  of
the  sentence  in  order  to  resolve  ambiguities.  I  can  report  to  you  that  such 
ambiguities  are  a  very  frequent  occurrence.  A  very  large  number  of 
sentences  are  really  ambiguous  from  this  grammatical  point  of  view.  We
are  not  bothered  by  such  ambiguity  when  we  read  because  we  understand 
the  meaning  and  it  is  this  understanding  of  the  meaning  of  the  sentence 
that  carries  us  through  the  ambiguities. 

Our  hopes  have  been  dashed  again.  The  essential  limit  of  a  program 
for  parsing  a  sentence  is  just  in  this  area  which  I  like  to  call  “semantics.” 
A  number  of  the  groups  working  on  mechanical  translation  are  now  facing 
up  to  the  problem  of  semantics.  This  problem  appears  to  be  orders  of  mag¬ 
nitude  more  difficult  than  the  syntactic  problem.  We  have  a  few  hunches, 
but  I  don’t  think  we  have  the  foggiest  idea,  really,  of  how  to  solve  this 
problem.  Nevertheless,  most  of  the  groups  are  working  at  it.  They  are  trying 
ad  hoc  rules,  and  they  are  trying  various  other  schemes.  They  have  also 
tried  schemes  such  as  the  following. 

You  all  know  that  you  can  row  a  boat.  Now,  it  turns  out  that  there  aren’t 
very  many  other  things  that  you  row,  other  than  boats.  The  word  r-o-w  is 
ambiguous:  it  could  be  a  row  (a  brawl)  or  a  row  (of  objects).  In  other 
words  the  meaning  of  this  word  or  the  solution  of  this  ambiguity  can  be 
found  partially  but  not  completely.  In  the  general  case  one  must  also  take 
care  of  the  meaning  of  the  sentence.  One  way  of  doing  this  is  to  list  in  the 
dictionary  that  it  is  boats  that  you  row  and  not  other  things.  Much  informa¬ 
tion  of  this  kind  in  the  dictionary  might  be  quite  useful  in  resolving  ambi¬ 
guities.  There  are  other  methods  that  have  been  proposed.  One  is  to  order 
the  words  in  the  dictionary  in  much  the  same  way  as  they  are  in  a  thesaurus, 
by  meaning  categories  with  indexes  and  connections  between  words,  and 
putting  them  into  fields  of  knowledge  and  fields  of  interest  much  the  same 
way  Roget  did  in  his  Thesaurus.  There  are  several  other  such  schemes.  In 
other  words,  we  are  taking  the  first  faltering  steps  into  the  area  of  semantics. 
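
The idea of listing in the dictionary what it is that one rows can be sketched in Python as below. Everything in the table is an assumption for illustration: the three sense labels, the cue words attached to each, and the function name resolve. Only the boat example comes from the text.

```python
# Each sense of "row" lists words expected nearby. Only "boat" comes
# from the text; the other cue words are hypothetical.
ROW_SENSES = {
    "propel with oars": {"boat"},
    "line of objects": {"seats", "houses"},
    "quarrel": {"neighbors"},
}


def resolve(senses, context_words):
    """Return the single sense whose cues occur in the context, else None."""
    context = set(context_words)
    matches = [sense for sense, cues in senses.items() if cues & context]
    return matches[0] if len(matches) == 1 else None


print(resolve(ROW_SENSES, "they row the boat".split()))
print(resolve(ROW_SENSES, "a row broke out".split()))
```

The second call finds no cue and leaves the word unresolved, which reflects the point that such listings settle the ambiguity partially but not completely.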

Now,  as  to  braille,  I  think  that  the  complete  and  correct  transcription 
of  contracted  braille,  according  to  the  currently  accepted  official  rules  of 
standard  English  braille,  is  not  currently  feasible.  I  want  to  be  very  clear 
about  this;  it  is  exactly  what  I  mean.  I  say,  “It’s  not  feasible,”  but  on  the 
other  hand  it  is.  Attend  very  carefully  to  the  qualification:  it  is  more  than 
“not  feasible”  vs.  “feasible.” 

Automatic  transcription  is  feasible  if  certain  of  the  rules  are  compro¬ 
mised.  The  real  question  is,  what  is  the  degree  of  compromise  that  is  neces¬ 
sary?  I  suggest  that  we  work  out  the  best  compromises  and  standardize  them 
into  a  new  type  of  braille  specifically  for  machine  transcription.  All  the  rules 
should  be  phrased  in  terms  of  the  conventional  spelling  of  the  original  text 
with  no  reference  to  pronunciation,  grammatical  function,  or  meaning. 
This  “machine  transcription  braille”  should  conform  as  closely  as  possible 
to  the  current  practice  so  that  it  could  be  read  interchangeably  with  hand
transcribed  braille.  Now  this  is  precisely  what  is  being  done,  except  that  we 
have  not  standardized  our  usage.  The  braille  programs  that  we  have  now  do 
operate  with  rules  stated  in  terms  of  the  traditional  spelling.  In  other  words, 
a  pronunciation  rule  would  be  restated:  “you  will  do  such-and-such,”  in¬ 
stead  of  saying,  “you  do  such-and-such  except  when  you  would  pronounce 
it  some  other  way”  (you  do  such-and-such  and  list  the  exceptions).  This  is 
tantamount  to  restating  the  rule  in  terms  of  the  inkprint  spelling. 

If  we  devise  machine  programs  that  actually  are  used  for  transcribing 
braille,  a  little  thought  should  be  given  to  stating  these  rules  the  way  we 
really  want  to  use  them,  while  realizing  that  machine  programs  can  be  very 
easily  changed  to  conform  with  any  set  of  braille  rules  one  might  wish  to 
use.  I  think  it  would  behoove  the  people  who  are  interested  in  what  the 
machine  produced  braille  is  going  to  look  like  to  look  at  the  rules  as  they  are 
stated  now,  and  to  the  problem  of  restating  these  rules  in  some  way  so  that 
machine  programs  can  be  written  that  will  give  the  kind  of  braille  they  want. 
They  must  realize  that  it  is  impossible  to  program  a  machine  to  transcribe 
braille  according  to  the  rules  as  they  now  stand,  because  the  criteria  now  put 
down  have  to  do  with  the  pronunciation,  with  the  grammatical  structure,  or 
with  the  meaning  of  a  sentence.  These  are  problems  that  have  not  been 
solved  even  in  the  mechanical  translation  of  languages.  They  are  in  fact  ex¬ 
tremely  difficult  problems. 

I  have  one  other  comment.  It  would  be  a  good  idea  to  capitalize  upon 
the  rather  wide  availability  of  punched  tape  from  the  printing  industry.  I 
imagine  that  this  has  been  suggested  before.  I  feel  that  this  material  should 
be  placed  in  a  central  repository  so  that  people  who  want  to  make  braille 
editions  would  have  it  available.  There  are  other  groups  that  are  also  inter¬ 
ested  in  a  centralized  repository  for  this  material.  I  would  think  it  would  be 
very  wise  to  contact  these  groups  and  work  with  them.  The  other  groups  are 
primarily  those  concerned  with  mechanical  translation  (who  would  like  to  have
the  material  for  translation)  and  those  associated  with  information
retrieval  (or  the  automatic  library).  I  don’t  know  where  the  best  place
would  be  for  such  a  center;  possibly  the  Library  of  Congress.  Publishers 
send  copies  there  anyway  for  copyright  purposes.  Perhaps  they  wouldn’t 
mind  sending  their  punched  paper  tapes  to  the  Library  of  Congress.  I  don’t 
know;  I  presume  that  the  Library  of  Congress  is  not  set  up  for  this  kind  of 
thing,  that  there  would  have  to  be  something  added.  Perhaps  it  is  unsatis¬ 
factory  as  a  repository  for  other  reasons;  but  I  certainly  feel  that  this  should 
be  explored,  and  it  should  be  explored  concurrently  with  other  groups  that 
are  also  interested,  particularly  the  information  retrieval  people. 

In  concluding,  I  should  like  to  consider  a  number  of  specific  questions
having  to  do  with  problems  in  applying  the  caveats  I  have  discussed.*
Among  these  I  would  include  those  dealing  with  (1)  paper  tapes,  (2)  contractions
vs.  syllable  boundaries,  (3)  the  anticipated  or  possible  contraction
in  a  revised  and  “computer-oriented”  braille,  and  (4)  the  argument  for
complex  translation  programs  versus  the  generation  of  a  modified  braille. 

THE  PROBLEM  OF  OBTAINING 
PUNCHED  PAPER  TAPES 

This  is  an  extremely  difficult  problem.  I  personally  feel  that  the  best  solution,
which  perhaps  is  not  feasible,  would  be  to  update  the  procedures  in  the
printing  industry.  After  all,  the  printing  industry  was  mechanized  about  50 
years  ago,  when  Monotype  was  really  the  last  word  in  automation.  It  uses 
a  player  piano-like  roll  which  is  not  quite  as  wide  as  that  for  the  player 
piano.  It  is  read  the  same  way,  with  compressed  air;  it  huffs  and  puffs  and 
chugs  along,  and  is  not  really  in  line  with  modern  automation  techniques. 
Monotype  has  served  the  printing  industry  admirably;  I  could  imagine  they 
would  be  loath  to  change  unless  a  very  real  advancement  were  achieved.
There  are  people  who  are  thinking  in  terms  of  computer  programs  to  help 
correct  errors.  In  fact  there  are  some  at  MIT  who  are  doing  this  sort  of  thing. 
If  one  can  get  the  material  that  is  to  be  published  on  tape  and  into  a  com¬ 
puter,  then  it  is  possible  to  write  a  program  that  will  correct  this  tape  to 
order.  This  has  a  very  great  advantage,  namely  that  one  does  not  have  to 
proofread  the  material  carefully  once  more.  Once  it  is  set,  once  it  has  been 
proofread,  once  it  is  correct — it  is  there,  it  is  done.  I  feel  that  there  is  a  great 
deal  of  room  for  advancement  in  this  area. 

CONTRACTIONS,  SYLLABLE  BOUNDARIES,
AND  THE  COMPRESSION  OF  COMPUTER-
ORIENTED  BRAILLE

I  think  that  there  is  no  doubt  that  contractions  across  a  syllable  boundary 
could  tend  to  slow  a  person  up.  I  think  the  problem  here  is  to  state  the  rules 
in  such  a  way  that  a  machine  can  follow  them;  in  other  words,  to  state 
mechanical  rules  that  will  give  braille  that  is  readable.  Probably  a  statistical 
approach  here  would  do.  If  there  is  only  one  word  out  of  ten  pages  that  is 
going  to  slow  up  a  reader,  then  it  is  not  going  to  slow  him  very  much  overall.
If  one  can  state  the  rules  in  such  a  way  that  they  do  the  right  things
effectively  most  of  the  time,  then  if  a  mistake  is  made  once  in  ten  pages,  or
a  contraction  is  made  across  a  syllable  boundary,  the  risk  is  worth  taking.
Let  us  make  a  standard  to  do  the  contracting  so  that  different  people  who
have  different  programs  can  still  produce  the  same  braille.  I  think  that  once
the  reader  is  used  to  the  results  they  might  not  slow  him  up  very  much.

*  The  material  in  this  section  was  prepared  from  the  question  and  answer  period
following  Dr.  Yngve's  paper.  (Ed.)

My  guess  is  that  the  degree  of  compression  in  braille  would  not  be 
changed appreciably. This is only an impression, for I have not made a
study  of  this.  The  size  of  a  braille  book  is  not  likely  to  be  increased  by  very 
much;  perhaps  by  one  page  out  of  100,  or  something  like  that. 

There  are  two  problems  here.  One  is  the  physical  problem  of  storing 
all  the  words.  One  would  have  for  example  the  word  “hothouse,”  in  which 
the  “th”  should  not  be  contracted,  presumably.  There  are  several  ways  of 
approaching  this  problem  in  a  machine.  One  is  to  list  all  these  words.  This 
means  merely  looking  up  the  word  in  the  dictionary,  seeing  what  list  it  is  in, 
perhaps  storing  only  one  list,  the  smaller  one.  If  the  word  is  not  in  that  list, 
then  do  the  job  the  longer  way.  There  is  one  problem  here  in  that  the  list 
might  require  a  fairly  large  storage.  One  way  out  is  to  store  only  the  words 
one  expects  to  run  into  frequently  and  not  the  others.  Then  the  rule  would 
be  correctly  followed  most  of  the  time.  Another  approach  would  be  to  look 
at  the  spelling  and  to  make  such  rules  as  “  ‘th’  after  ‘s’  should  be  contracted” 
(assuming  that  we  find  that  this  is  generally  the  case),  and  for  “hothouse,” 
the  “th”  after  a  vowel  perhaps  would  not  be  separated.  In  other  words,  it 
might  be  possible  to  state  the  rules  in  terms  of  spelling  and  yet  have  a 
fairly  satisfactory  result.  Whichever  way  it  is  done,  there  is  not  too  much 
difference  from  a  machine  point  of  view,  except  it  is  out  of  the  question  to 
list  all  of  the  words  involved  in  some  of  these  rules. 

The other problem is that there are many of these words; with vocabulary
one is dealing essentially with an open class. People can invent new
words,  for  example,  and  when  they  have  invented  new  words  one  wants  the 
program  to  deal  with  them  correctly.  It  is  not  feasible,  however,  to  list 
words  that  haven’t  been  invented  or  used. 
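
The two approaches discussed above, a stored list of exceptions and a fallback
mechanical rule, might be sketched as follows in a present-day programming
language. This is only an illustration under stated assumptions: the function
name contract_th, the three listed words, and the bracketed marker standing in
for the braille sign are all hypothetical and are not taken from any working
program or from the braille code itself.

```python
# Hypothetical sketch: decide whether the letter pair "th" in a word may be
# replaced by a "th" contraction. The word list and the "[th]" marker are
# illustrative assumptions only.

# Small stored list of frequent words in which "th" spans a syllable boundary.
NO_CONTRACT_TH = {"hothouse", "pothole", "lighthouse"}


def contract_th(word: str) -> str:
    """Return the lowercased word with contractible 'th' pairs marked '[th]'."""
    lower = word.lower()
    if lower in NO_CONTRACT_TH:
        # Listed exception: leave the word fully spelled.
        return lower
    # Word not in the list: apply the mechanical rule and contract every "th".
    # An unlisted compound will occasionally be contracted across a syllable
    # boundary; that is the accepted risk.
    return lower.replace("th", "[th]")


if __name__ == "__main__":
    for sample in ("hothouse", "sloth", "foothold"):
        print(sample, "->", contract_th(sample))
```

Under this sketch an unlisted compound such as "foothold" is contracted across
its syllable boundary, which is the kind of occasional error the discussion
treats as a risk worth taking. Newly invented words are handled the same way,
since they can never appear in the stored list.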

LARGE  COMPUTERS  VS.  MODIFIED  BRAILLE 

First of all, I agree 100 percent with the statement made here that the
machine should serve man and not vice versa; this is in part my own motivation
in working towards mechanical translation. Communication between different
linguistic communities now goes entirely through people who are to
some  extent  bilingual.  If  we  had  some  machine  aid  in  this  area  we  could, 
I think, do something by machine which is quite a burden to people. I don't
want to be misunderstood on this point: from one point of view, one must
change  the  rules  of  braille  if  the  job  is  going  to  be  done  by  machine.  From 
another  point  of  view,  the  rules  need  not  be  changed.  It  depends  on  just 
what  one  means  by  the  phrase  “changing  the  rules  of  braille.”  If  one  lists 
all  the  words,  and  indicates  how  they  are  to  be  contracted,  this  is  a  rule. 
This  is  a  different  rule  from  the  kind  of  rule  that  tells  us,  “You  must  not 
cross  syllable  boundaries.”  The  result  may  be  precisely  the  same,  which  is 
to  the  good  if  it  is  judged  that  the  braille  as  currently  written  is  the  best. 

In  other  words  I  am  not  proposing  to  alter  the  braille  codes  as  they 
are  currently  written  unless  there  are  good  reasons  to  do  so  from  the  point 
of  view  of  the  reader.  But  I  am  proposing  that  the  rules  be  restated  in  a 
machine-usable  form,  as  they  are  in  fact  now  being  applied  by  working 
programs. The other comment I would have is that such rules as the use of
the “to” contraction and the “by” contraction (in the case of the preposition
and not in the case of the adverb) are rather difficult to mechanize;
it is not out of the question, but it would take a rather sophisticated
computer program. We don’t know quite how to do this completely adequately.
If  this  rule  were  restated  in  some  other  way  that  would  give  the  result 
intended  (or  very  close  to  it),  then  I  think  we  should  do  so,  and  we  should 
say  to  ourselves,  “This  is  machine  braille  that  we  are  using.” 


AUTOMATIC  BRAILLE  REPRODUCTION 


VIRGIL  E.  ZICKEL 

American  Printing  House  for  the  Blind,  Louisville,  Kentucky 


During  the  past  ten  years  some  effort  has  been  expended  to  produce  braille 
by  the  application  of  the  principles  and  techniques  of  automation.  This 
effort  was  initiated,  at  least  in  the  United  States,  by  the  Library  of  Congress; 
in  1954  the  American  Printing  House  for  the  Blind  (APH)  was  selected 
to  conduct  an  exploratory  survey  to  determine  the  need  and  the  areas  for 
which  research  might  result  in  an  improvement  in  the  quality  of  braille  and 
the  lowering  of  its  cost.  As  a  result  of  this  survey,  several  definite  projects 
were selected and approved by the Library for further research and
development over the next three years.

Several studies were made in cooperation with the University of
Kentucky to determine the optimum physical dimensions of braille. These
studies  were  concerned  primarily  with  interdot,  intercell,  and  interline 
spacing. 

Some  effort  was  made  to  fabricate  a  braille  book  paper  that  would  be 
attractive,  have  maximum  classroom  life,  and  optimum  readability.  Under 
this  program  an  effort  also  was  made  to  improve  the  braille  plate  embossing 
machines  to  increase  operating  speed  and  raise  the  level  of  accuracy  of 
embossing.  Special  emphasis  was  laid  on  the  need  to  lessen  the  variation  in 
dot  height. 

Through  a  cost  analysis  of  braille  book  production  from  editing  through 
binding,  it  was  learned  that  the  braille  plate  making  operation  which  includes 
embossing,  proofreading,  and  correcting  is  the  largest  single  item  of  cost 
in  this  operation.  This  is  due  primarily  to  the  fact  that  most  runs  of  braille 
are  very  small.  The  Library  of  Congress,  for  illustration,  normally  orders 
28  to  30  copies  of  a  book;  for  this  number  the  plates  alone  account  for  over 
two-thirds  of  the  total  cost.  The  fact  that  plate  making  represents  such  a 
large part of book cost suggests that research in this area could be
rewarding, and that the use of the recent developments in technology might result
in  lower  cost  for  the  plates. 

A braille plate consists of a sheet of zinc, iron, or aluminum approximately
.010 inch thick, 9½ inches wide and 25 inches long. This plate is
folded to form a double sheet of metal 9½ inches by 12½ inches. It is then
placed in a stereograph machine where it is embossed through both
thicknesses, forming in effect a male and female die. After embossing, proof
sheets  are  made  from  the  plates  and  are  checked  for  errors.  The  plates  and 
the  error  sheets  are  returned  to  the  operator  for  correcting.  These  three 
operations — embossing,  proofreading,  and  correcting — each  require  about 
the  same  amount  of  time.  The  actual  translation  is  done  by  the  operator  at 
the  same  time  as  the  embossing. 

A  systems  engineering  approach  to  the  braille  plate  making  operation 
suggests  that  a  three  unit  system  consisting  of  an  input  unit  for  originating 
and verifying punched cards or tape, an intermediate unit (probably a
computer) in which the actual translation would be accomplished, and a third
unit  from  which  the  output  of  the  computer  would  be  used  to  control  a 
braille  plate  embossing  machine,  could  logically  replace  the  present  manual 
operation. 

The input unit presents no problem since punched tape and card
machines are in everyday use. The intermediate unit probably presents the
greatest  problem,  since  it  involves  the  selection  of  the  computer  that  is 
most  practical  for  the  purpose,  and  many  factors  affect  this  decision,  two  of 
the  most  important  of  which  are  the  size  of  memory  and  the  actual  per  word 
cost  of  operation.  Although  the  output  unit  presents  several  problems  too, 
there  appeared  to  be  advantages  in  early  construction  of  this  unit  as  it  would 
prove  the  feasibility  of  automatic  control  of  the  embossing  machine,  the 
practicability  of  verifying,  and  could  result  in  some  immediate  saving  in 
plate  cost. 

In 1955 the construction of the first tape controlled braille plate embossing
system was started in IBM’s laboratory. This machine consists of a
braillewriter/tape punch unit to originate the tape, a tape
reader/braillewriter/tape punch unit for verifying, and a tape
reader/stereograph machine combination for the output. Tape control was
selected for this first unit since both tape punch and read units are small
and flexible, which resulted in the entire unit being a small efficient machine.

Through  several  years  of  use  in  production,  this  pilot  project  proved 
the feasibility of tape control of the embossing machine; further, that
verifying could be used in lieu of proofreading. Actually the level of accuracy
obtained  through  verifying  proved  to  be  higher  than  that  obtained  with 
conventional  proofreading.  The  shape  and  height  of  the  braille  dots  are  also 
much  better  with  the  tape  controlled  unit,  since  corrections  are  not  made 
on the plates as they are with the conventional system. Braille plate cost
with this unit is slightly lower than with the manual operation, but probably
not  enough  to  justify  the  increased  cost  of  the  equipment. 

The  positive  results  obtained  with  the  tape  control  embossing  project 
encouraged  further  research  on  the  translation  unit,  and  in  October  1958, 
IBM  agreed  to  write  a  program  for  the  IBM  704  computer  to  translate 
fully  spelled  input  into  Grade  2  braille.  IBM  also  agreed  to  translate  12 
books  to  determine  if  machine  translation  could  afford  the  necessary  degree 
of  accuracy.  It  was  also  important  to  learn  whether  or  not  the  machine 
could  produce  acceptable  braille  format.  (‘Format’  means  page  number, 
correct  number  of  cells  per  page,  correct  number  of  lines  per  page,  etc.) 

It  was  agreed  that  APH  should  select  the  titles  to  be  translated  so  that 
they  would  present  as  nearly  as  possible  the  unusual  situations  encountered 
in applying the braille literary code in day-to-day production. Special
consideration was to be given to unusual format, the author’s unique style, and
word  usage.  Among  the  books  chosen  were  How  Big  Is  Big,  Kipling’s 
Jungle  books,  The  Fanny  Farmer  Cookbook,  Psychology  for  Living,  and 
New  Ways  In  Sex  Education. 

The  translation  program  was  developed  in  the  IBM  Department  of 
Mathematics  and  Applications  in  New  York  City.  The  actual  work  was  done 
by Mrs. Ann Schack, working closely with the braillist at the APH. As we
might  expect,  some  of  the  situations  that  we  felt  would  cause  trouble  failed 
to  do  so,  while  some  of  the  simpler  situations  (simple  at  least  from  the 
standpoint  of  the  manual  transcriber)  presented  the  biggest  problems.  For 
instance,  one  of  these  books,  an  elementary  reader,  Up  And  Away,  has  only 
a  few  lines  on  the  page  of  the  original  inkprint  edition,  the  type  is  large, 
and  the  lines  are  widely  spaced.  Normally  the  braille  version  of  a  book  of 
this  kind  is  written  with  twice  the  normal  line  spacing,  and  the  lines  on  the 
reverse  side  of  the  sheet  fall  between  the  lines  on  the  first  side.  This  situation 
coupled  with  the  fact  that  running  headings  are  placed  only  on  the  numbered 
pages  presented  one  of  the  toughest  format  problems  the  program  had  to 
solve. 

The  choice  of  the  IBM  704  computer  was  based  on  the  flexibility  the 
machine  affords  and  the  fact  that  it  provided  the  memory  or  storage  capacity 
that  the  project  seemed  to  require.  Since  the  704  used  punched  cards  for 
both  input  and  output,  it  was  decided  that  the  embossing  machine  at  APH 
should be fitted up for card control as well as tape control, so that a
conversion from cards to tape would not be necessary. This was accomplished
by  adapting  an  IBM  056  Verifier  as  the  card  reader  input  to  the  embossing 
machine. 


A  program  was  completed  and  machine  translation  of  braille  became  a 
reality  in  April  1959.  Several  revisions  were  necessary,  of  course,  before 
the  program  attained  its  present  form,  but  in  the  Spring  of  1962  the  last 
of  the  12  books  was  completed. 

It  should  be  emphasized  that  the  program  as  written  takes  no  liberties 
with  the  braille  code,  the  only  compromise  being  that  words  normally 
hyphenated  at  the  end  of  a  line  with  the  manual  system  are  not  hyphenated 
with  the  automated  system;  instead  the  entire  word  is  carried  over  to  the 
next  line. 
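
The line-filling compromise just described can be sketched as a simple greedy
procedure. The following is a minimal illustration and not the IBM 704
program: the function name fill_lines and the default width of 38 cells per
line are assumptions made here for the example, and each character is counted
as one cell.

```python
# Hypothetical sketch of line filling without end-of-line hyphenation:
# a word that does not fit is carried over whole to the next line.

def fill_lines(words, cells_per_line=38):
    """Group words into lines of at most cells_per_line cells each."""
    lines = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= cells_per_line or not current:
            # The word fits, or the line is empty (an over-long word is
            # placed alone rather than split).
            current = candidate
        else:
            # The word does not fit: close the line and carry the word over.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


if __name__ == "__main__":
    for line in fill_lines("automatic braille reproduction".split(), 20):
        print(line)
```

The cost of this simplification is some unused space at the ends of lines,
which a manual transcriber would partly recover by hyphenating.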

When  the  first  few  books  had  been  completed  and  it  appeared  that 
translation  was  feasible,  APH  decided  with  the  help  of  IBM  to  adapt  two 
more  stereograph  machines  for  card  control.  This  was  done  because  it  had 
been  found  that  to  emboss  a  plate  takes  approximately  five  minutes,  the 
equivalent  of  12  plates  an  hour.  This  capacity  was  far  too  small  and  did  not 
utilize  the  operator’s  time  efficiently.  With  three  output  machines  250  plates 
can  be  produced  each  eight-hour  day.  This  is  roughly  equivalent  to  one- 
half  the  total  production  of  the  Printing  House. 
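
The capacity figures above can be checked with a few lines of arithmetic. The
sketch below uses only the numbers given in the text. Reading the gap between
the computed ceiling and the stated 250 plates as time lost to loading and
handling is an assumption, not something the text states.

```python
# Figures taken from the text.
MINUTES_PER_PLATE = 5
MACHINES = 3
HOURS_PER_DAY = 8
STATED_PLATES_PER_DAY = 250

# Plates per hour on one machine.
plates_per_hour = 60 // MINUTES_PER_PLATE

# Daily ceiling for all machines if none is ever idle.
ceiling_per_day = plates_per_hour * MACHINES * HOURS_PER_DAY

# Share of that ceiling represented by the stated daily figure.
share_of_ceiling = STATED_PLATES_PER_DAY / ceiling_per_day

print(plates_per_hour, ceiling_per_day, round(share_of_ceiling, 2))
```

Three machines running without interruption would give 288 plates in eight
hours, so the stated figure of 250 corresponds to a little under nine-tenths
of the ceiling.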

The  experience  gained  from  these  12  books  suggests  that  machine 
translation is not only possible, but it will prove to be faster and more
efficient than the manual method; it promises to be entirely satisfactory. With
a table look-up type of program, some up-dating will have to be done
continuously. It is felt, however, that some procedure such as translating the
first  chapter,  then  reading  the  proof  listing  and  making  the  indicated 
changes,  will  enable  us  to  cope  with  any  unusual  style  or  word  usage  that 
is  encountered  in  a  particular  book  without  making  a  complete  retranslation 
necessary. 

The  accuracy  of  the  complete  system  seems  to  be  ensured  by  the  fact 
that  input  cards  are  verified,  that  a  proof  listing  will  be  read  at  least  in  part, 
and  that  the  embossing  machine  incorporates  the  fail-safe  principle,  thereby 
practically  eliminating  machine  errors.  The  only  remaining  factor  that 
might  make  the  system  impractical  is  economic  and  APH  has  given  this  a 
great  deal  of  serious  thought.  Whenever  possible  accurate  records  have 
been kept of each operation and we know within reason the cost of
punching and verifying the input cards; we know, too, the cost of embossing the
metal  plates. 

Some  estimates  of  computer  cost  have  been  made;  however,  there  are 
many  factors  affecting  this  cost  other  than  the  actual  per  word  cost  of  the 
machine.  Since  a  computer  sophisticated  enough  to  handle  the  braille 
program is not available in the Louisville area, it is necessary that the cards
be sent to and from a computer installation in another city; that the job
be  fitted  into  the  computer  schedule  and  an  operator  familiar  with  the  braille 
project  be  provided.  Since  this  presents  so  many  incidental  problems,  the 
best  solution  found  so  far  is  to  have  a  member  of  the  APH  staff  actually 
carry  the  cards  to  the  computer  (after  arrangements  have  been  made  of 
course),  translate  the  cards,  and  bring  the  output  cards  back  to  Louisville. 
It  is  obvious  that  this  procedure  is  both  time  consuming  and  costly — so 
costly  that  it  appears  necessary  that  a  computer  be  obtained  for  use  at  APH 
if  machine  translation  is  to  be  a  production  operation. 

Since the cost of the computer is so indefinite, we might approach the
economics of the automated system in reverse order. That is, we might take
the cost of a plate with the manual system and subtract from it the known
cost with the automated system. The difference would then be the amount
that could be allowed for the unknown operations. Following this reasoning,
we find that the average cost of a braille plate with the manual system is
$1.80; this is the cost of one numbered page of approximately 900
characters. With the proposed automated system it would cost 50 to 55 cents
to prepare the input for a similar amount of material. The cost of the
output is approximately 15 cents per page, and the metal costs 10 cents per
page. These amounts add to approximately 80 cents. The difference between
this amount and the present cost of the manual operation would then be
$1.00 per page, the absolute maximum that could be allowed for the
computer cost and still make machine translation economically feasible.
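
The reverse-order reasoning above amounts to a single subtraction, which can
be written out as a short sketch. All amounts are the per-page figures quoted
in the text, expressed in cents; the function name computer_allowance is
introduced here for illustration only.

```python
# All amounts in cents per page, taken from the text.
MANUAL_PLATE_COST = 180
INPUT_PREPARATION = (50, 55)   # low and high estimates
OUTPUT_EMBOSSING = 15
METAL = 10


def computer_allowance(input_cost: int) -> int:
    """Cents per page left over for the computer at break-even."""
    known = input_cost + OUTPUT_EMBOSSING + METAL
    return MANUAL_PLATE_COST - known


low_input, high_input = INPUT_PREPARATION
print(computer_allowance(high_input))  # known costs of 80 cents
print(computer_allowance(low_input))   # known costs of 75 cents
```

With the higher input estimate the known costs come to 80 cents and the
allowance to $1.00, the figure used in the text; with the lower estimate the
allowance is five cents larger.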

Pursuing this line of thought further, we find that APH produces
approximately 120 thousand braille plates per year; it is estimated that this
figure will reach 150 thousand plates within the next two years. Since,
however, a sizable amount of this work consists of mathematics, music, foreign
languages, and other technical publications for which programs have not
been written and machine translation may not be practical, it seems
reasonable to assume that one-half of the total could be done on the automated
system. Then assuming that some 60 to 75 thousand plates could be
machine produced at $1.00 per page, we would have available 60 to 75
thousand dollars as a maximum we could spend for the computer, its
maintenance, and its operation annually and still be within the present
plate cost.

Personnel  to  operate  the  machine  would  probably  cost  7  to  8  thousand 
dollars  a  year;  maintenance  20  to  25  thousand  dollars  per  year;  and  an 
additional  10  thousand  dollars  will  be  needed  for  power  and  air  conditioning. 
If the installation cost can be written off, a saving in plate cost of 20 to 25
thousand dollars a year might be realized. This is equal to a saving of 20
to  25  percent  in  the  cost  of  the  plates  made  by  this  process  over  those  made 
in  the  conventional  manner.  It  should  be  emphasized  that  these  are  only 
estimates;  however,  if  these  figures  are  reasonably  accurate  the  use  of 
machine  translation  could  result  in  a  saving. 
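
The annual estimate can be laid out in the same way. The sketch below uses
only the ranges quoted in the text; pairing the extremes of those ranges to
obtain a least favorable and a most favorable remainder is an assumption made
here, since the text does not say how its own figure was derived.

```python
# Dollars per year; each pair is the (low, high) range quoted in the text.
PLATES = (60_000, 75_000)
ALLOWANCE_PER_PLATE = 1            # dollars available per machine-made plate
PERSONNEL = (7_000, 8_000)
MAINTENANCE = (20_000, 25_000)
POWER_AND_AIR_CONDITIONING = (10_000, 10_000)

# Money available for the computer and its operation.
budget = tuple(n * ALLOWANCE_PER_PLATE for n in PLATES)

# Running costs named in the text (installation cost excluded).
running = tuple(
    p + m + c
    for p, m, c in zip(PERSONNEL, MAINTENANCE, POWER_AND_AIR_CONDITIONING)
)

# Least favorable: low budget against high running cost, and the reverse.
remainder_worst = budget[0] - running[1]
remainder_best = budget[1] - running[0]

print(budget, running, remainder_worst, remainder_best)
```

On these figures the running costs come to 37 to 43 thousand dollars and the
remainder lies between 17 and 38 thousand dollars, a range that contains the
estimate of 20 to 25 thousand dollars given in the text.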

Looking  to  the  future:  additional  titles  should  be  translated  as  early 
as  possible,  as  there  is  much  to  be  learned.  A  great  deal  was  learned  from 
the  first  12  titles;  we  know  of  course  that  machine  translation  can  be 
faster, probably more accurate, and far more flexible than the manual
system, but the actual per plate cost and the percentage of the total work load
that  can  be  handled  in  this  manner  efficiently  can  only  be  proven  in  regular 
day  to  day  operation. 

Since  a  computer  sophisticated  enough  to  handle  the  project  is  not  now 
available  in  the  Louisville  area,  it  seems  wise  to  take  steps  to  obtain  one  for 
use  at  APH.  There  are  several  colleges  in  Louisville,  including  the  Speed 
Engineering  School;  possibly  the  computer  needs  of  these  schools  can  be 
pooled  with  APH,  thereby  providing  computer  time  not  now  available  and 
at  a  reasonable  cost. 

Constant  development  underway  in  the  computer  field  suggests  that 
machines of this type will continue to become smaller, less costly to
purchase, and easier to operate. If we assume that these developments will
result in a lower per word cost while the cost of labor continues to spiral
upward, the use of the computer becomes more of a necessity than a
convenience or a novelty if we are to meet the needs of blind people for adequate
reading  material. 


ENHANCING THE AVAILABILITY OF BRAILLE*

ROBERT  W.  MANN 

Massachusetts  Institute  of  Technology,  Cambridge,  Massachusetts 


INTRODUCTION 

The striking and woeful unbalance between the availability of printed and
written material to the sighted and to the blind has inspired the study,
design, and fabrication of systems and devices to enhance the blind person’s
access to the sighted person’s world. The interest of Mechanical Engineering
faculty at MIT in the problems of the blind was initiated by the biweekly
Sensory Research Discussions originated and chaired since the fall of 1959
by John K. Dupress, Director of Technological Research of the American
Foundation for the Blind. Initial investigations were unsupported and
conducted entirely within the context of undergraduate laboratory and design
project and theses work. Subsequently, a small grant from the American
Foundation for the Blind made possible the fabrication of, and experiments
with, several research devices. As of January 1961 the project has had
formal support from the Office of Vocational Rehabilitation of the
Department of Health, Education and Welfare. The contract funds several
research and development tasks in the fields of sensory aids and prostheses
in the Engineering Projects Laboratory at MIT. This paper is concerned
with efforts related to braille.

* The work reported has been supported primarily by the Office of Vocational
Rehabilitation of the U. S. Government Department of Health, Education and
Welfare under contract SAV-1004-61. Mr. John K. Dupress, Director of
Technological Research, American Foundation for the Blind, has inspired and
encouraged our efforts, including his sponsorship of a small grant from the
Foundation which sustained student work on the early stages of the
investigation.

The project is undertaken in the context of our academic laboratory (the
Engineering Projects Laboratory) and so benefits directly from the involvement
of faculty, graduate, and undergraduate students. Professor D. M. Baumann has
supervised the development of the Brailler and Typewriter Accessory assisted by
graduate research assistants D. W. Kennedy and G. F. Staack, and undergraduates
D. G. Eglinton, S. A. Lichtman, and J. A. Robertson. Professor E. E. Blanco and
graduate student K. F. Johansen have participated in the development of the
braille transducers, and freshmen A. S. Ivestor and P. J. Siemens have
contributed significantly to the type compositor’s tape to mechanized braille
study.

BRAILLE  TRANSLATION  AND  REPRODUCTION 

Braille material is made available to the blind by means of press embossing
of those textbooks and periodicals which warrant distribution in excess of
several copies. Original braille transcriptions are produced as single copies
on braillewriters (such as the Perkins) and duplicated by recently developed
vacuum forming plastic processes. However, it is generally accepted that
these facilities for braille reproduction severely restrict the quantity and
range of material in braille for the blind. Dedicated private and institutional
efforts coupled with government subsidy make possible, through printing
presses such as the American Printing House, Howe Press, etc., the
reproduction, in quantities incredibly small compared to inkprint productions,
of selected titles of text and recreational reading material. At the single
copy braille transcription level, numerous volunteers translate printed
material into braille for the benefit of students, professionals, etc. The actual
total output of braille material is, of course, restricted first by the
production capacity of existing presses, and second by the availability of
geographically widely distributed volunteers (154 groups in all) who have the
talent and time to prepare individual transcriptions. Efforts are under way
to increase the efficiency of the machinery for braille press reproduction.
Much could be done to encourage the training of volunteer transcribers
and to provide devices which will make their efforts more efficient. The
braille encoding typewriter accessory and the high speed brailler described
in this report are examples of the latter.

In  all  these  cases  braille  reproduction  hinges  upon  the  availability  of 
a  sighted  reader  who  has  the  language  and  braille  training  and  experience 
necessary to convert the visual printed letter, word, or page into its
corresponding braille symbolism. One can divide the activity of this sighted
braille  translator  into  two  steps:  the  visual  reading  of  the  inkprint  pattern 
and  the  intellectual  translation  of  that  inkprint  into  corresponding  braille 
symbols. 

These two processes, pattern recognition and translation, are fields subject
to extensive investigation quite apart from the problem of blind communication.
The vast problem of our ever-growing store of library reference
material, particularly in technological areas; the problem it poses of storage,
referencing, and access; the problem of making printed symbolic material
available directly to data processing equipment for correlation and analysis;
military and commercial problems of identifying particular patterns in a
field  of  view  for  aircraft  control,  surveillance,  and  other  reasons — all 
motivate  a  broad  spectrum  of  generously  funded  activity  in  the  pattern 
recognition  or  reading  machine  area.  While  progress  is  being  made  in  this 
field  we  are  still  far  from  the  point  where  a  machine  is  a  demonstrated  and 
effective  substitute  for  the  human  reader. 

The second activity of the sighted braille transcriber is that of
translation: conversion of the printed ink symbol into its braille counterpart.
Goals other than blind communication again justify a substantial investment
of time and money in this field. Language translation is an obvious example.
The devising of computer programming codes which facilitate either
input/output from the computer to its environment, or the manipulation of
information inside the computer, is another example of work in the translation
area. Specific efforts are, of course, underway to devise and test the
adequacy of computer translation of alphabetical word input to a computer into
its corresponding Grade 2 braille.* These efforts illuminate the theorem
applicable to all machine translation programs: only where clear unambiguous
rules for the translation correspondence (no matter how complicated
they may be) can be defined can the machine be expected to carry through
the process. The braille translation work under way underscores the need
for studies of the present ambiguities in the Grade 2 braille code. This is
a separate investigation which has had some beginning scrutiny by the MIT
group (3).

TYPE  COMPOSITOR  TAPE  TO 
MECHANIZED  BRAILLE  SYSTEM 

In the light of the difficulties of mechanizing pattern recognition, it seems
appropriate to ask whether in the preparation of braille material it would
be possible to circumvent the need for interpreting and encoding, in
machine-interpretable form, the information content of the printed page.
Could this step be avoided, the first and at present most intractable of the
contributions of the sighted braille transcriber could be dispensed with.

In fact most printed material (which of course encompasses all that
material under any circumstances of interest for translation into braille) is
at some time prior to its final inkprint publication encoded in a
machine-interpretable form. This development comes about as a consequence of
advances in the printing industry itself through which virtually all type
composition is done automatically; and also as a consequence of the
frequent desire to transmit editorial material by means of wire between a
point of origin and the point of publication. Thus we have the
Teletypesetter wire transmission system, and Linotype, Monotype, and Photon
type composition. Each of these is a somewhat different process which poses a
somewhat different problem in terms of utilization for blind communication.
But all offer the intriguing possibility of capturing the content of the
printed page in an encoded form directly digestible for further data
processing, rather than relying upon the sighted reader or waiting for the
development of reading machines.

* As for example the work of Mrs. Schack of IBM in the cooperative program
between IBM and the American Printing House for the Blind; also Dr. Abraham
Nemeth of the University of Michigan.

The  system  to  be  considered  then  consists  of  obtaining  the  material  to 
be  ultimately  presented  in  braille  in  the  form  of  type  compositors'  tapes, 
using  these  tapes  as  the  input  to  a  centralized  data  processing  system  which 
would  edit,  translate,  and  reproduce  the  information  in  a  form  suitable  for 
wide  scale  distribution  through  the  mails  to  individual  blind  persons  and  to 
libraries. Recipients would have transducers which would convert the distributed
form of the material into the equivalent of embossed braille.

The  feasibility  of  such  a  scheme  depends,  of  course,  upon  the  kind  of 
material  which  the  blind  need  and  want  and  its  availability  in  the  form  of 
type  compositor’s  tape. 

THE  READING  NEEDS  OF  THE  BLIND 

On  the  question  of  what  the  blind  want  there  is  unfortunately  little  in  the 
way  of  systematically  collected,  unbiased  data,  although  there  are  indications 
that this situation will improve. The blind’s access to the printed page includes
braille tactile input, aural inputs including “talking books” tape recordings,
and listening to sighted readers.

In the way of specific data on the proportional use of these various
techniques by the blind, a frequently cited statistic is that, while the number
of current titles available in recorded and in braille form is about the
same (with the aggregate number of titles available in braille substantially
larger), according to the records of the Library of Congress only 3 percent
of the 380,000 legally blind persons in the United States borrowed a
braille book from a library, whereas some 30 percent of the blind population
took one or more recordings from a lending source.

The hazard of too literal an interpretation of these data is, first, that
neither the estimate for braille books nor that for recordings represents the total
use of either of these techniques by the blind. In the case of braille the



figure does not include the vast amount of material made in small quantities
for use in the elementary, high school, and college education of the blind
as well as for professional activities, and the use by the blind of braille for
their own personal notation and correspondence. In the case of recordings
similar exclusions are obvious. The second reservation on casual interpretation
of these figures is the influence of the type and total amount of material
available for use by the blind. In the case of braille books, for example,
the total annual production at the present time is something like
160,000 braille page plates per year, which roughly corresponds to half that
many inkprint pages. Now a total of 80,000 pages a year represents very
few books, implies very few titles, and implies a gross least-common-denominator
basis for choosing titles. Beyond this, a typical press run is 28
to 30 copies, which, considering the national distribution of the estimated
380,000 legally blind, presents an incredible waiting list problem for the
popular titles. Similar arguments could be mounted for restrictions on the
availability of aural material. Thus, in considering the various means by
which the blind can take advantage of existing sensory channels for communications
purposes it appears unwise to attempt to draw exclusive
recommendations between alternatives. Rather we should attempt to enhance
the availability and efficiency of all demonstrably practical modes
as well as to explore new alternatives.
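
The arithmetic behind the figures quoted above can be laid out explicitly. The following is a minimal sketch in Python that uses only the numbers given in the text; the value of 300 inkprint pages per title is an assumption introduced purely for illustration and is not a figure from the Library of Congress records.

```python
# Figures quoted in the text.
LEGALLY_BLIND = 380_000            # estimated legally blind population, U.S.
BRAILLE_BORROWER_PERCENT = 3       # borrowed a braille book from a library
RECORDING_BORROWER_PERCENT = 30    # took one or more recordings
BRAILLE_PLATES_PER_YEAR = 160_000  # braille page plates produced annually
PRESS_RUN_COPIES = (28, 30)        # typical press run, low and high

# Assumption for illustration only; not a figure from the text.
ASSUMED_INKPRINT_PAGES_PER_TITLE = 300


def borrowers(percent: int) -> int:
    """Number of legally blind persons represented by a percentage."""
    return LEGALLY_BLIND * percent // 100


def inkprint_pages_per_year() -> int:
    """The text equates the plate count to roughly half as many inkprint pages."""
    return BRAILLE_PLATES_PER_YEAR // 2


def titles_per_year(pages_per_title: int = ASSUMED_INKPRINT_PAGES_PER_TITLE) -> int:
    """Rough number of titles the annual page output could cover."""
    return inkprint_pages_per_year() // pages_per_title


def persons_per_copy(copies: int) -> int:
    """Legally blind persons per embossed copy of a single title."""
    return LEGALLY_BLIND // copies


if __name__ == "__main__":
    print("braille borrowers:  ", borrowers(BRAILLE_BORROWER_PERCENT))
    print("recording borrowers:", borrowers(RECORDING_BORROWER_PERCENT))
    print("inkprint pages/year:", inkprint_pages_per_year())
    print("titles/year (assumed 300 pages each):", titles_per_year())
    for copies in PRESS_RUN_COPIES:
        print(f"persons per copy at {copies} copies:", persons_per_copy(copies))
```

Under the stated assumption the annual output amounts to a few hundred titles, and each copy of a title would have to serve on the order of thirteen thousand legally blind persons, which is the waiting list problem the text describes.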

The spectrum of blind individuals constitutes a microcosm of the total
population. It spans a wide range of ages and vocations, from the elementary
school child learning to read, through high school and college training, which
in turn ranges from fields in which the assimilation of a great deal of straight
copy is essential (as in the liberal arts) to fields in which symbolic representations
and exceedingly concise presentations dominate (as in the physical
sciences).

A  recent  study  (2)  concerned  primarily  with  blind  college  students’ 
use  of  recorded  textbooks  provides  some  data  on  comparative  uses  of 
tactile and aural reading aids. Thirty-four percent of the students queried
(91  percent  of  the  402  blind  in  college  known  to  be  using  “Recordings 
for  the  Blind”)  preferred  recordings  for  all  their  reading;  an  additional  40 
percent  preferred  recordings  for  long  descriptive  nontechnical  texts;  91 
percent  preferred  recordings  for  light  reading.  But  in  the  very  areas  where 
braille  texts  are  scarcest  many  students  said  they  were  desirable:  the 
survey  indicated  a  demand  for  braille  texts  in  languages,  mathematics,  and 
complicated  materials.  About  60  percent  found  braille  important  in  reading 
and  learning  formulae  and  the  like.  These  preferences  are  at  least  in  part 



due  to  the  fact  that  a  reader’s  comprehension  increases  as  the  amount  of 
active  participation  in  reading  increases;  thus  the  braille  reader  who  must 
follow  the  text  with  his  fingers  finds  it  easier  to  comprehend  than  the  passive 
listener who uses recordings. Also, it is easier for the user of braille to
reiterate phrases he fails to grasp on first reading. For grade school children,
where  integration  of  blind  and  sighted  children  in  public  schools  rather  than 
the  blind  residential  system  is  becoming  more  predominant,  it  is  important 
to  have  a  display  of  the  text  corresponding  as  nearly  as  possible  to  the  books 
their  sighted  peers  are  reading. 

Beyond  educational  reading  needs  there  is  the  professional  literature, 
scientific,  medical,  legal,  or  otherwise,  whose  availability  to  the  blind  is 
severely  restricted  due  to  the  vast  and  growing  amount  of  material  and  the 
small,  varied,  and  unpredictable  need  of  parts  of  it  by  individual  blind 
persons. Then there is the wide range of recreational reading, including
periodicals and full-length books, and informational reading, including
contemporary news media, etc.

The general lack of sociological data on the reading needs of the blind
and the limited scope of the project at MIT have made it impossible to carry out
any comprehensive study of these needs. But a study of the available
literature, the results of conferences (some of them held at MIT) concerning
communications with and for the blind, and conversations with numerous
knowledgeable blind persons and those concerned with blind rehabilitation have
indicated that at the very least additional braille transcribed information is
essential for the educational process from the elementary through the
college levels, particularly at those stages at which the child is actively learning
to read and when the teenager or young adult is studying in a field which
involves symbolic and concise notation, including at all levels those learning
processes in which active involvement of the learner is essential. A
second prime need is the whole field of professional journals, which are
essentially unavailable to the blind except through the services of individual and
cooperative volunteers. A third obvious though less numerous need is that of
the approximately 3000 deaf-blind for whom the tactile input is the only
communication channel.

THE AVAILABILITY OF APPROPRIATE
TYPE COMPOSITOR’S TAPES

On the availability of type compositor’s tapes of material of especial interest
to the blind, a partial survey of publishers has revealed that between 40 percent
and 65 percent of the elementary school texts published by major



houses are printed using either Monotype (the most common equipment)
or other tape operated systems such as a modified Linotype. Beyond the
elementary level use of Monotype dwindles, since it is a somewhat more
expensive process than the widely used Linotype. The exception to this
rule is the field of mathematics and science, the very area in which braille
texts are most needed, where texts are printed almost universally in Monotype.
In the field of professional journals, those in scientific fields again use
Monotype exclusively (one large publisher reported its intention of
changing to the Photon method, another tape operated process).

The punched typesetting tape must be made available to the agency
sponsoring the transcription into braille. The publishers surveyed uniformly
expressed their willingness and ability to make the tapes available and to
cooperate concerning copyright restrictions.* Since the tape is used only
once in setting up galleys of the printed books and is then discarded, there
are no intrinsic technical reasons it cannot be used as an input to a
transcription system.

An important obstacle to the Monotype-to-braille system is the multiplicity
of differences between the printed product and the text contained
on  the  tape.  For  example,  many  of  the  headings  that  appear  in  a  printed 
book  or  magazine  are  set  by  hand  and  inserted  into  the  galleys  of  text. 
Illustrations,  displays,  tables,  and  other  visual  aids  are  inserted  later. 
In  the  case  of  braille  transcription,  these  latter  pose  a  special  problem,  for 
these  displays  must  be  presented  in  a  modified  form  for  the  benefit  of  a 
blind  reader.  At  present  braille  texts  leave  out  all  illustrations  except  those 
essential  to  the  content;  for  these  are  substituted  either  simplified  embossed 
line  diagrams  or  brief  descriptive  paragraphs.  In  any  case  this  material 
must  receive  special  treatment. 

Even  within  the  textual  material  itself  the  Monotype  tape  contains 
many  deviations.  Errors  in  the  tape  are  corrected  directly  in  the  galleys  of 
type.  Often  unusual  characters  or  special  symbols  will  not  be  contained  on 
the  Monotype  tape;  later  they  are  inserted  in  the  galleys  of  lead.  Last- 
minute  changes  in  content  of  textbooks  may  be  made  by  hand  without  using 
the  Monotype  machine.  Taken  together  these  differences  comprise  a  very 
sizeable  error.  This  error  must  not  be  carried  over  into  the  transcriptions 
made  for  the  blind,  but  its  correction  may  require  much  time  and  effort. 

* Replies were received from Ginn and Company; Scott, Foresman and Company;
Laidlaw Brothers; Row, Peterson and Company; and the U. S. Government
Printing Office, as well as from the American Mathematical Society and the American
Institute of Physics.



The Linotype source represents potential advantages relative to Monotype,
provided provision is made for the simultaneous generation of a
punched tape on those Linotype machines which mechanically compose
the type while providing no permanent record. Wherever Linotype is used
for remote or multiple type setting such a tape is prepared. However, in
many of the older Linotype machines no such provision is made. Either
standard Teletypesetter equipment or a typewriter accessory,
such as is described later in this report, could be used. The Linotype
process has the intrinsic characteristic that since type is cast in a single slug
comprising an entire line, errors are more awkward to correct than in the case
of Monotype where single letters can be removed and replaced. Thus with
Linotype once one has line-length slugs (or the tape equivalent of them)
one is assured of the absence of typographical errors. Beyond this the Linotype
is frequently used in conjunction with the Teletypesetter process, by
which means editorial material is sent from a point of composition to a
remote point of publication. In such cases the tape is a pluperfect edition
of the final copy, since all typographical and editorial corrections have been
made. Generally speaking Teletypesetter tape represents current periodical
news and editorial commentary. While as a general rule most of the blind
receive such information through standard radio, the deaf-blind could
maintain contact with their contemporary world only were it practically
possible to convert these Teletypesetter tapes into their braille equivalent.

MECHANIZED  BRAILLE  DISPLAYS 

Large scale effective utilization of the type compositor’s tape-to-mechanized-braille
system discussed in the previous section of the report implies the
availability of simple, small, rugged, and reliable transducers with which
the distributed medium can be interpreted by the blind as embossed braille.
Beyond this use such transducers would be very useful in research investigations
on variations of braille coding and presentation methods. There
are, for example, ambiguities in the present Grade 2 braille code from a
human interpretation point of view which suggest study quite aside from
possible changes in the code which would facilitate machine interpretation
and translation letter-by-letter into Grade 2. Actually the primary consideration
of human interpretability is really inseparable from the secondary
consideration of machine translation, since the complications and ambiguities
in the code which slow down and frustrate the human reader are in
many cases the same vicissitudes which make machine interpretation
impossible. Thus while respecting both the evolution of contracted braille



and the extensive study of it by teachers of and workers with the blind, it
is deemed desirable on a research and investigatory level to explore advantages
which might be derived from changes in the code or from going beyond
the present braille and considering quite different modes of presentation.
Theoretical studies of such alternatives can postulate possible advantages,
but the effectiveness of changes can be demonstrated only after patient,
thorough experimentation leads to acceptance by the blind. The existence
of mechanized transducers would greatly facilitate the educational and
psychophysical testing essential to the evaluation and acceptance of such
variations.

Experiments at the MIT Engineering Projects Laboratory (MIT/EPL)
have been carried out thus far on three transduction approaches, all based
on punched tape as the information storage medium, whether for distribution
or for experiment. Fortunately the hole spacing on standard punched paper tape
corresponds to cell spacing in braille symbolism. Thus the braille symbol to
be  “embossed”  by  the  transducers  can  be  punched  into  tape  and  used 
directly  to  position  pins  with  heads  shaped  like  the  braille  embossing, 
up  to  correspond  to  an  active  cell  or  down  to  indicate  the  absence  of  a 
braille  “bump.”  Figure  1  illustrates  the  use  of  hemispherical-head  pin 
elements  in  this  approach.  Figure  5  is  an  instrument  using  this  principle 
in  which  the  presentation  is  analogous  to  a  standard  page  presented  a  line 
at  a  time.  The  tape  feeds  from  left  to  right  and  is  advanced  one  line 
segment  by  means  of  the  lever  shown.  As  an  alternative,  a  continuous 
presentation  scheme  (see  Figure  6)  utilizing  a  belt  has  also  been  built. 
Figure  7  is  the  original  device.  A  second  improved  model  is  just  about 
completed.  We  plan  to  experiment  with  both  these  devices  in  order  to 
understand  more  completely  the  kinesthetics  of  the  blind  person’s  hand 
and  finger  as  he  reads  braille  in  order  to  shed  some  light  on  how  the 
standard  presentation,  line  by  line  on  a  fixed  page,  compares  with  the 
somewhat  more  passive  continuous  presentation. 
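
The direct correspondence between tape holes and pin positions can be sketched in a few lines of Python. This is an illustration only, not a description of the MIT/EPL hardware: it uses the standard braille dot numbering (dots 1 to 3 down the left column of the cell, dots 4 to 6 down the right) and assumes that a punched hole lets the pin rise while an unpunched position holds it down. The function names are hypothetical.

```python
# Standard braille dot numbering: dots 1-3 run down the left column,
# dots 4-6 down the right column of the 3-row by 2-column cell.
DOT_POSITIONS = {
    1: (0, 0), 2: (1, 0), 3: (2, 0),
    4: (0, 1), 5: (1, 1), 6: (2, 1),
}


def punch_cell(dots):
    """Return a 3x2 grid of booleans; True marks a hole punched in the tape."""
    grid = [[False, False] for _ in range(3)]
    for dot in dots:
        row, col = DOT_POSITIONS[dot]
        grid[row][col] = True
    return grid


def pin_states(tape_cell):
    """Assumed behavior: a hole lets the pin rise, no hole keeps it down."""
    return [["up" if hole else "down" for hole in row] for row in tape_cell]


def render(tape_cell):
    """Text picture of one cell: 'o' for a raised dot, '.' for a flat position."""
    return "\n".join(
        "".join("o" if hole else "." for hole in row) for row in tape_cell
    )


if __name__ == "__main__":
    letter_b = punch_cell({1, 2})  # braille letter "b" is dots 1 and 2
    print(render(letter_b))
    print(pin_states(letter_b))
```

Because the tape itself carries the cell geometry, no decoding step is needed between the stored medium and the display; the same grid that is punched is the grid that is felt.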

Figure  3  illustrates  a  second  technique  for  direct  mechanical  tape  to 
braille  output.  Small  ball  bearings  are  entrained  by  holes  in  the  tape,  in 
this  case  a  hole  corresponding  to  the  presence  of  a  braille  bump.  An 
embodiment  of  this  scheme  is  illustrated  in  Figures  8  and  9.  The  motion 
of  the  tape  strips  ball  bearings  from  a  hopper.  The  balls  are  restrained 
and  the  tape  is  supported  (at  the  equator  of  the  balls)  by  grooves  in  a 
magnetized  supporting  plate.  A  receiver  and  vibrator  would  return  the 
balls  to  the  supply  hopper. 

A  third  scheme  which  has  had  some  cursory  investigation  is  suggested 



in  Figure  4.  A  tufted  or  flocked  textile  fabric  would  be  arranged  so  that 
fibers  could  extend  through  the  holes  in  the  punched  tape,  thus  providing 
a  tactile  stimulation.  Samples  of  a  number  of  textile  fabrics  have  been 
investigated  and  additional  inquiries  as  to  suitable  material,  textures,  fiber 
distribution,  etc.,  have  been  made. 

Finally,  Figure  2  illustrates  the  use  of  air  jets  as  the  tactile  stimulator. 


Figure 2 Basic Transducer Elements: Compressed Air Jets (labels: tape; compressed air region)


Figure  3  Basic  Transducer  Elements:  Spherical  Elements 


Figure  4  Basic  Transducer  Elements:  Tufted  Fabric 




Figure  5  Line  At  a  Time,  Pin  Element  Tape  to  Braille  Transducer 




Figure  6  Belt  Type  Continuous  Reader 


An MIT/EPL report by Lester Saslow describes psychophysical research
conducted using this stimulation technique (4). In Mr. Saslow’s apparatus
the air jets were controlled somewhat differently, but the tape could be
used directly as a pneumatic valve as indicated. This scheme implies the
availability of compressed air, which in turn requires additional machinery
and input power and therefore contravenes to some extent our




Figure  7  Continuous,  Pin  Element  Tape  to  Braille  Transducer 


Figure 8 Continuous Ball Reader (labels: magnetic surface; transverse section)

original design goals of a simple, possibly portable, device. In this same
regard it might be appropriate to observe that the IBM Belt Reader (1), in
which pins were translated in a thick plastic belt for continuous presentation
as braille, involved considerable complication and complexity. A tape
reader converted punched tape to electrical impulses which were processed
and converted by electromagnets into a mechanical motion which, in turn,
set up the pins.

HIGH SPEED ELECTRIC BRAILLER
AND TYPEWRITER ACCESSORY

A complementary project in the MIT Engineering Projects Laboratory
program considers the flow of typewritten correspondence, memoranda,




Figure  9  Ball  Element  Tape  to  Braille  Transducer 


reports, etc., which constitutes the lifeblood of day-to-day office routine,
from which the blind individual is either largely excluded, or in a restricted
way cued belatedly through the generosity of aural or braille transcribers
and the blind individual’s persistence. To ameliorate this situation an
accessory to any standard or portable typewriter has been designed and
constructed which converts each key depression into the generation, simultaneously
with the printed figures, of a corresponding encoded and amplified
electrical signal.

The  typewriter  accessory  will  operate  a  high  speed  brailler,  which  is 
completely  designed  and  partially  fabricated.  In  order  to  operate  at  electric 
typewriter  speed  (120  words  per  minute),  to  compensate  for  the  almost 
2  to  1  line  length  magnification  necessary  in  going  from  the  typewriter 
to  braille,  and  to  provide  for  7  cell  combinations  (as  the  “capital  letter”), 
the  brailler  employs  three  light  weight  printing  heads  arranged  in  tandem, 
with  the  first  head  completing  a  braille  line  as  the  second  head  starts 
the  next  line  (see  Figure  10).  Solenoid  operated  transposers  (driven 
directly  from  the  typewriter  accessory  or  any  other  appropriately  coded 
electrical  input)  set  up  the  braille  character  in  the  heads,  whereupon 
embossing takes place by platen movement. Figure 11 illustrates the


Figure 10 Operation of Heads and Transport Mechanism
(Figure annotations: 36-40 cell locations in platen. The heads are spaced on the chain such that there is always exactly one printing cell under the active platen area at any given time, i.e., no dead zone.)


Figure 11 MIT Brailler Printing Head




Figure  12  Block  Diagram  of  MIT  Braille  Printer 


braille  head  with  the  braille  “cell”  on  the  top  surface  indicating  size. 
Figure  12  gives  the  over-all  block  diagram  of  the  brailler  while  Figure  13 
shows  a  partial  assembly. 
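
The speed requirement stated above can be translated into an embossing rate. The sketch below is a rough calculation in Python, not a specification of the MIT brailler: it takes the 120 words per minute and the 36 to 40 cells per line from the text and figure, and assumes five characters per word and one braille cell per typed character (no contractions and no extra cells for capitals). Both assumptions are for illustration only.

```python
WORDS_PER_MINUTE = 120        # electric typewriter speed quoted in the text
CELLS_PER_LINE = (36, 40)     # cell locations per braille line (Figure 10)

# Assumptions for illustration only; not figures from the brailler design.
ASSUMED_CHARS_PER_WORD = 5
ASSUMED_CELLS_PER_CHAR = 1


def cells_per_second(wpm: int = WORDS_PER_MINUTE) -> float:
    """Braille cells the heads must set up and emboss each second."""
    return wpm * ASSUMED_CHARS_PER_WORD * ASSUMED_CELLS_PER_CHAR / 60


def milliseconds_per_cell(wpm: int = WORDS_PER_MINUTE) -> float:
    """Time available for one set-up-and-emboss cycle."""
    return 1000 / cells_per_second(wpm)


def seconds_per_line(cells: int, wpm: int = WORDS_PER_MINUTE) -> float:
    """Time to fill one braille line of the given length at full speed."""
    return cells / cells_per_second(wpm)


if __name__ == "__main__":
    print("cells per second:", cells_per_second())
    print("milliseconds per cell:", milliseconds_per_cell())
    for cells in CELLS_PER_LINE:
        print(f"seconds per {cells}-cell line:", seconds_per_line(cells))
```

On these assumptions a single head would have only a few seconds per line and no idle time for a return stroke, which is consistent with the text's choice of several heads in tandem so that one head can begin the next line as another finishes.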

The  typewriter  accessory  (see  Figure  14)  is  mechanical,  optical,  and 
electronic,  with  plastic  shutters  transposed  by  key  depression  so  as  to 
obstruct  14  parallel  light  rays,  whose  impingement  on  photodiodes  registers 
the  signal.  The  encoding  of  the  plastic  shutters  is  accomplished  by  breaking 
off  the  appropriate  blades  on  each  shutter. 
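
The shutter encoding can be pictured as a 14-position binary code. The following Python sketch is a hedged illustration: the text states only that 14 parallel light rays are obstructed by shutter blades and sensed by photodiodes, so the code table `KEY_CODES`, the bit ordering, and the function names are hypothetical, and the sketch assumes that an intact blade blocks its ray while a broken-off blade lets the ray through.

```python
NUM_BEAMS = 14  # parallel light rays described in the text

# Hypothetical code assignments for illustration; the actual codes are not given.
KEY_CODES = {"a": 0b00000000000001, "b": 0b00000000000011, "c": 0b00000000001001}


def make_shutter(code: int) -> list:
    """Blades to leave intact (True) on a key's shutter; the rest are broken off."""
    if not 0 < code < 2 ** NUM_BEAMS:
        raise ValueError("code must be nonzero and fit in 14 positions")
    return [bool((code >> i) & 1) for i in range(NUM_BEAMS)]


def beams_lit(shutter: list) -> list:
    """Photodiode view while the key is depressed: a ray is lit unless a blade blocks it."""
    return [not blade for blade in shutter]


def decode(lit: list) -> int:
    """Recover the code from the pattern of dark photodiodes."""
    return sum(1 << i for i, is_lit in enumerate(lit) if not is_lit)


if __name__ == "__main__":
    shutter = make_shutter(KEY_CODES["b"])
    print("intact blades:", sum(shutter))
    print("decoded code:", decode(beams_lit(shutter)))
```

With no key depressed every ray is lit and the decoded value is zero, so in this sketch every key must be given a nonzero code.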




Figure  13  MIT  Brailler:  Partial  Assembly 


Figure  14  MIT  Brailler  Typewriter  Accessory 



While  designed  to  augment  each  other,  the  typewriter  accessory  and 
brailler  can  also  be  used  independently.  Thus  the  accessory  might  be  used 
for direct input for wire transmission, or the brailler used with a separate
keyboard (both one- and two-hand keyboards will be designed), or as
a  slave  to  any  other  signal  transmission  system. 

CONCLUSIONS 

Research  and  development  have  been  directed  toward  making  braille  more 
easily  and  more  widely  available.  Consideration  has  been  given  to  the  blind 
person’s  particular  needs  for  braille  (as  compared  with  aural  transmission) 
in  the  learning  process,  in  professional  literature,  and  for  the  deaf-blind. 
The  utility  and  availability  of  type  compositor’s  tape  as  a  source  of  braille 
translation  and  transcription  has  been  explored.  Problems  involved  in 
editing  tape  sources  to  achieve  correspondence  with  inkprint  text  have 
been  touched  on  and  require  further  investigations.  Mechanized  braille 
displays  for  both  system  output  and  braille  research  have  been  devised. 
Refined  models  for  psychophysical  testing  are  under  construction.  A  high 
speed  electric  brailler  which  will  operate  at  electric  typewriter  speed  from 
any  of  a  variety  of  inputs  has  been  designed  and  partially  fabricated.  An 
accessory  for  a  standard  or  portable  typewriter  which  simultaneously 
provides  brailler  input  has  been  developed. 

REFERENCES 

1. Bryce, James W., and John N. Wheeler. “Reading apparatus.” U. S. Patent Office,
September 5, 1950. (United States Patent Number 2,521,338.)

2. Development of Basic Research Materials and a Manual on the Use of Recorded
Textbooks. New York: Recordings for the Blind, Inc., Appendix A, p. 14.

3. Minutes of the Conference on Automatic Data Processing and the Various Braille
Codes, MIT. New York: American Foundation for the Blind, 1961.

4. Saslow, Lester. Tactile Communication Using Air Jets. Cambridge: Massachusetts
Institute of Technology, (Report No. 8768-1 of the Engineering Projects
Laboratory to the Office of Vocational Rehabilitation of the Department of
Health, Education, and Welfare).

FURTHER  REFERENCES 

Dickman, R. J. “An Encoder for a Grade II Braille Typewriter.” S. M. Thesis,
Department of Electrical Engineering, Massachusetts Institute of Technology, 1960.

Eglinton, D. G. “Preliminary Design of the Mechanical to Electrical Coding
Conversion for a Typewriter to Brailler Converter.” S. B. Thesis, Department of
Mechanical Engineering, Massachusetts Institute of Technology, 1961.

Lichtman, S. A. “The Design of a High-Speed Slave Brailler for a Braille Converter
Device.” S. B. Thesis, Department of Mechanical Engineering, Massachusetts
Institute of Technology, 1961.

Mapes, R. R. “Tactile Reading Aid for the Blind.” S. B. Thesis, Department of
Mechanical Engineering, Massachusetts Institute of Technology, 1960.

Mann, R. W. “Rehabilitation Via Engineering Skills,” Rehabilitation Record, Vol.
3, No. 1. Washington, D. C.: U. S. Department of Health, Education, and Welfare,
1962.

Staack, G. F. “A Study of Braille Code Revisions.” S. M. Thesis, Department of
Mechanical Engineering, Massachusetts Institute of Technology, 1962.

Traver, A. E., Jr. “Design of Tactile Reading Aid for the Blind.” S. B. Thesis,
Department of Mechanical Engineering, Massachusetts Institute of Technology,
1961.


STE-RE:  A  SYSTEM  OF  COMMUNICATIONS 


WILLIAM  H.  STEVENSON,  WILLIAM  J.  REAVES,  and 
WALLACE  A.  WARREN 

Reynolds  Tobacco  Company,  Winston-Salem,  North  Carolina 


INTRODUCTION 

All  forms  of  sophisticated  communications  invented  by  man  have  one 
common  denominator:  a  code  must  be  created  and  learned,  memorized. 
The  printed  or  spoken  word  is  made  up  of  simple  codes  for  letters  which 
alone  mean  nothing;  these  simple  codes  are  assembled  into  groups  making 
words  (a  more  complex  code)  which  do  mean  something. 

All  man-made  codes  used  for  communication  are  dependent  on  the 
visual,  aural,  or  tactile  acuity  of  the  nervous  system.  No  one  method  is 
any  more  “natural”  than  the  other.  A  blind  person’s  greatest  concern  is 
in  the  practical  use  of  any  reading  or  writing  system  or  device.  The  method 
of  coding  and  the  sensory  system  selected  determine  the  ease,  speed,  and 
over-all  versatility  necessary  to  a  satisfactory  system. 

The  purpose  of  our  research  and  experimentation  was  to  determine 
or  confirm  the  parameters  used  in  the  reduction  of  Ste-Re.  We  are  well 
aware  of  the  fact  that  our  experiments  and  tests  were  not  conducted  with 
the  best  controls  and  under  the  best  conditions;  the  test  equipment  was  not 
in  some  cases  as  accurate  as  we  would  have  liked,  nor  were  the  tests 
extensive enough for us to accept the results as confirmed. We made do with
what we could get and what we could afford, and accepted the results
as indicative enough for us to proceed.

These  tests  will  not  be  presented  chronologically,  but  rather  as  we 
feel  they  best  suit  the  expository  purpose  of  this  paper. 

EXPLORATORY  TESTING 

A  comparative  analysis  was  made  among  the  printed  books,  braille 
books,  and  talking  books.  The  results  of  this  comparison  were  used  in  time 
and  motion  and  other  studies. 

To  understand  braille  tactile  reading  more  completely,  an  experiment 
was  devised  to  determine  the  amount  of  pressure  necessary  at  the  end  of 



the  reading  finger  to  interpret  the  braille  code.  For  this  experiment  seven 
braille  readers  aged  25  to  60  years  were  chosen.  Five  were  expert  readers 
capable  of  reading  over  400  characters  per  minute.*  Three  were  very  poor 
readers, worked with their hands, and were poorly educated. These
three had studied but never used braille.

The  method  used  was  to  fasten  a  card  containing  several  hundred 
braille  characters  on  the  surface  of  a  direct  reading  balance  scale.  The 
scale chosen was accurate to within 2.268 grams and was heavily damped
to  minimize  bounce  or  inertia.  The  subject  placed  his  reading  finger  on  the 
card  and  identified  the  characters  by  calling  them  out.  The  pressure  he 
applied  was  recorded. 

All  the  subjects  were  right-handed.  All  subjects  used  only  one  finger 
to  read  braille;  six  used  the  forefinger  of  the  right  hand  and  the  seventh 
used  the  forefinger  of  the  left  hand.  The  purpose  of  the  test  was  explained 
to  the  subjects. 

The  result  was  that,  with  only  one  exception,  all  persons  tested  applied 
a  pressure  of  approximately  45  grams  while  reading  and  22  grams  while 
returning  to  the  next  line.  The  one  exception  (a  qualified  reader)  applied 
12  grams  less  pressure  in  each  operation. 

The  pressure  range  within  which  braille  cells  can  be  identified  is  very 
narrow  and  requires  highly  developed  tactile  perceptual  judgment.  Too 
little pressure applied does not stimulate a sufficient number of nerve endings
to  permit  positive  identification;  too  great  a  pressure  puts  the  cutaneous 
tissues  under  general  tension  and  identification  is  impossible.  One  of  the 
subjects  had  heavily  calloused  skin,  yet  the  pressure  he  applied  was  the 
same  as  used  by  the  other  subjects.  It  took  him  much  longer  to  identify 
the  characters  and  he  made  errors,  but  he  did  not  try  to  push  harder. 

Of the two receptors of the cutaneous tissues, we believe that only
the sense of touch (tangoreceptors) is stimulated by the braille cell. We
do not believe that cutaneous pain (algesireceptors) is evoked.

That  tactile  perceptual  judgment  can  be  improved  and  developed 
with  practice  has  been  fairly  well  established.  Volkmann,  as  early  as  1858, 
found  that  the  distance  at  which  two  points  are  felt  can  be  halved  with 
practice (8). Titchener (6, 7), Dresslar (1), Solomons (4), and Tawney
(5)  among  others  all  have  worked  with  the  two  point  limen  on  the  skin  and 
confirm this increased acuity with training; however, this perception is lost
very rapidly with disuse. Gibson indicates that, “There is a basic loss with
disuse of improvement in the two point limen on the skin” (2).

* The University of Louisville, as a result of testing 250 braille readers, determined
that the average reading speed is between 70 and 80 words (350 to 400 characters)
per minute.

Braille  depends  on  a  six  point  limen  on  the  skin.  Only  a  limited 
number  of  blind  persons  can  ever  hope  to  become  proficient  braille  readers. 

TIME  AND  MOTION  STUDIES 
OF  BRAILLE  READING 

The  purpose  of  this  study  was  to  determine  how  braille  books  are  used 
by  blind  persons.  How  long  does  it  take  to  turn  pages?  Do  they  back- 
read?  What  are  the  reading  habits?  Most  sighted  slow  readers  have  the 
“see  and  hear”  habit;  do  the  blind  slow  readers  have  a  comparable 
“touch  and  hear”  habit? 

Eight  persons  were  studied.  One  above-average  reader  obviously  raced 
and  skipped  when  he  heard  the  camera  turn  on.  Good  observations  were 
obtained  for  only  three  of  the  subjects.  The  remaining  five  were  eliminated 
because  of  insufficient  readings  and  our  inability  to  correlate  mathematical 
calculations  with  stop  watch  observations.  We  did  not  have  a  sufficient 
number  of  qualified  subjects  to  justify  a  comprehensive  time  and  motion 
analysis  with  computer  aid.  Even  without  electronic  methods  a  thorough 
analysis  can  be  made  of  a  small  number  of  subjects. 

That  each  reader’s  methods  follow  a  pattern,  and  that  each  reader’s 
pattern  is  his  own  is  hardly  a  new  observation,  but  within  these  patterns 
a number of general observations were made. Braille readers frequently
reread unnecessarily. This habit, shared by many sighted readers, who reread
on average 11 out of every 100 words, is one of the first bad habits broken
in learning speed reading.

Some rereading or back-reading is necessary with braille due to confusion
created by contractions. The average time required to turn a page
and  locate  a  beginning  cell  varies  with  age  and  ability,  from  1.82  seconds 
to  2.74  seconds;  one  of  the  three  persons  who  qualified  for  the  complete 
study  required  3.78  seconds.  The  time  required  to  go  from  the  last  cell 
of  one  page  to  the  first  cell  of  the  next  page  facing  is  0.905  second  to 
1.40  seconds.  Grade  2  braille  readers  must  turn  5.78  pages  for  every  one 
page  turned  by  sighted  readers. 

The  average  time  required  to  go  from  the  last  cell  of  one  line  to  the 
first  cell  of  the  next  line  is  0.42  second  to  0.47  second.  Grade  2  braille 
readers must do this 2.3 times more frequently than sighted readers; however,
one subject, immediately upon reaching the last cell of a line with
his right hand, began reading the next line with his left until he could bring
his right hand to the new position. Another subject did this part of the time.
Little  time  is  lost  between  lines  with  this  method — 0.125  second,  on  the 
average. 

The  mechanics  of  reading  braille  are  physically  tiring  since  the  reader 
must  keep  both  hands  in  partial  suspension  to  read  properly  and  to  avoid 
damaging  the  cells.  The  weight  of  a  hand  varies  greatly  from  the  “big 
man”  to  “little  girl”  condition.  The  effective  pressure  range  has  been 
established  as  very  narrow;  guide  fingers  and  support  fingers  must  bear 
very  lightly.  Total  pressure  for  reading  and  support  fingers  is  about  114 
grams.  One  can  well  understand  that  reading  slows  down  considerably 
with  time;  to  this  we  must  add  the  energy  required  to  lift  the  hand  to 
new lines, new pages, etc. For example, my hand and that portion of
my arm necessarily supported while reading braille weigh a little more than
786 grams. We might conclude that a lot of energy is required to read
braille. 

The time lost in reading braille varies with the subject. Fifty percent
of the time lost is incurred in going from line to line, in going from page
to page, and in turning pages; the remaining 50 percent is due to back-reading,
fumbling, confusion in interpretation, physical tiring, and slowing down.

TACTILE  ACUITY 

We chose to redefine tactile acuity in three ways:

(1) Actual Acuity (as applied to braille),

(2) Effective Acuity (as applied to braille),

(3) Potential Acuity (as applied to Ste-Re; see below).

Actual  acuity  may  be  defined  as  the  average  number  of  braille  cells 
translated  per  second,  minus  all  time  and  motion  losses.  To  determine 
it  the  subjects  were  timed  while  reading  several  lines  of  32  cells  each, 
first  cell  to  last  cell,  during  the  first  part  of  the  test.  They  were  not  fatigued. 
These  lines  were  picked  at  random.  We  did  not  consider  time  between 
cells  as  lost  time. 

Effective  acuity  may  be  defined  as  the  average  number  of  braille  cells 
per  second  translated  by  a  subject,  including  all  lost  time  and  motion  for 
sustained reading of between 4000 and 5000 characters. Tests to determine
actual and effective acuity were conducted concurrently. Subjects
were  not  told  to  read  one  line  as  fast  as  possible,  nor  were  they  told 
when  these  observations  were  taking  place.  Frequency  of  turning  pages, 
changing from line to line, etc., as related to the printed book, were not
used as factors in evaluating the percent of time and motion lost. The
results  obtained  from  three  persons  who  qualified  for  this  test  are  shown 
in  Table  1. 


TABLE 1

RESULTS OF BRAILLE ACUITY TEST

                        Acuity (cells per second)
Test Subject    Age     Effective     Actual      Loss
G               28      7.80          10.20       30%
B               58      9.40          11.00       17%
E               68      5.75          8.10        40%
Average         50.3    7.28          9.77        29%
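
The Loss figures for the three subjects appear to express the gap between actual and effective acuity as a fraction of effective acuity, cut off at a whole percent. That reading is an inference from the tabulated values and is not stated in the text; the names used below are illustrative only.

```python
# Acuity values (braille cells per second) for the three subjects in Table 1.
SUBJECTS = {
    "G": {"effective": 7.80, "actual": 10.20},
    "B": {"effective": 9.40, "actual": 11.00},
    "E": {"effective": 5.75, "actual": 8.10},
}


def percent_loss(effective: float, actual: float) -> int:
    """Assumed formula: (actual - effective) / effective, truncated to a whole percent."""
    return int(100 * (actual - effective) / effective)


for name, row in SUBJECTS.items():
    print(name, percent_loss(row["effective"], row["actual"]))
```

Under this assumption the three subject rows come out at 30, 17, and 40 percent.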

In  summarizing  these  studies,  we  concluded  that  all  readers  lose  time 
in  reading  braille  books.  All  factors  considered,  we  believe  that  the  average 
blind  reader  of  Grade  2  braille  spends  about  25  percent  of  his  reading  time 
on  the  mechanics  necessary  in  reading  braille. 

AUDIBLE  CODE  LANGUAGES 

The  simplest  audible  code  is  a  system  of  wave  forms  of  constant  frequency 
divided into units of time. The dot-dash system of the International Morse
Code at 500 cycles per second is a typical example: with the dot lasting one
unit of time and the dash three, it requires a number of time units to
represent a single letter or a single thought. In reading the letter “J,” the
reader encodes a dot and three dashes, each separated by one time unit;
three time units separate letters and seven time units separate words.
These  time  units  command  a  mental  readout  of  previously  stored  time 
units,  and  the  trained  reader  responds  to  “J.”  Despite  the  rapidity  of 
computation  this  system  has  two  major  disadvantages.  First,  the  mind 
must  store  different  time  units  until  a  readout  command  (additional  time 
units)  is  given.  This  is  very  slow  and  difficult.  Second,  speed  of  reading 
is limited to the individual’s perception of time; if the code is too fast
and approaches the threshold of blend, time elements cannot be
differentiated and interpretation becomes impossible.
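
The number of time units the mind must hold for a single letter can be counted directly from the timing rules just given. The following is a minimal sketch; the function and constant names are illustrative.

```python
# Timing rules stated above: a dot lasts one unit, a dash three units,
# and one unit of silence separates the elements within a letter.
DOT_UNITS = 1
DASH_UNITS = 3
ELEMENT_GAP_UNITS = 1


def letter_duration_units(pattern: str) -> int:
    """Return the time units needed for one letter written as '.' and '-'."""
    sounding = sum(DOT_UNITS if mark == "." else DASH_UNITS for mark in pattern)
    gaps = ELEMENT_GAP_UNITS * (len(pattern) - 1)
    return sounding + gaps


# "J" is a dot followed by three dashes.
print(letter_duration_units(".---"))  # 13
```

The letter “J” thus occupies 13 time units before the three-unit letter space that signals the readout.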

Presently the International Code is not spoken or written as dot-dash;
rather, this rapidly produced code signal actually passes the
threshold of blend for individual sounds and becomes a rhythm: a single
rhythm represents a letter instead of merely a dot and three dashes. This
“rhythm reading” is also limited to the threshold of blend for group
sounds.  The  average  well-trained  operator  does  not  exceed  38  words  per 
minute.  To  exceed  this  speed  a  group  of  rhythms  must  be  memorized  as 
a single thought representing complete words. This will increase the possible
speed for highly trained persons to 40 to 50 words per minute. The
known  limit  of  Morse  Code  reception  is  about  70  words  per  minute.  A  first 
class  radiotelegrapher  must  be  able  to  send  and  receive  25  words  per 
minute.  This,  the  simplest  of  audible  codes,  is  also  the  slowest,  and  to 
approach  its  outside  limits  of  speed  requires  much  practice  and  memory 
work. 

The  simple  and  complex  stimuli  just  described  use  a  code  dependent 
upon elements of time. Another audible system utilizes frequency variations
of equal time units to represent a code. We have chosen to divide
this method into two parts, one for complex stimuli and one for multiple
complex stimuli. Complex stimuli may be defined as different sounds
superimposed one on another to represent one thing; for example, the
musical chord of “C” representing the letter “C.” The advantage of this
method over the time unit method is that one time unit represents one
code, as opposed to a number of time units representing one code. All
other relationships are the same as in the unit of time system. Multiple
complex stimuli may be thought of as groups of complex stimuli rapidly
produced to result in a blend of chords representing complete words. This
system resembles blend reading of the time unit system, but extends it
one step further. This tone difference has one advantage over the time
method: it is faster once learned. The disadvantage is that it is
much more difficult to learn and requires a very high development of aural
acuity. As compared to braille, one would have to memorize, identify, and
reflectively respond to 63 different chords in various blend forms.

The  remaining  audible  code  methods  are  the  spoken  word  of  talking 
books,  and  artificial  voice  and  stored  word  devices.  The  main  advantage 
of  the  spoken  word  is  that  it  is  easy  to  use  and  in  most  cases  (unless  one 
desires  to  learn  a  foreign  language)  does  not  require  any  learning.  The 
disadvantages are that speed cannot be changed to suit the individual;
it  is  difficult  to  index;  personal  interpretation  is  impossible;  it  requires 
the  use  of  earphones  if  one  does  not  wish  to  disturb  others  in  the  vicinity; 
as  embodied  in  a  system  so  far  conceived,  it  cannot  be  taken  to  class. 
Some  persons  interviewed  who  use  talking  books  complained  they  are  too 
slow,  others  complained  because  they  are  too  fast.  The  only  possible 
room  for  improvement  is  in  the  economy  of  production. 




All audible codes have one major disadvantage: their use requires
the ability to hear. Sometimes in our efforts in one field we are apt to
forget problems in others. Five percent of the blind of this country are
deaf and many more have impaired hearing. These persons cannot be
ignored; to solve their problem separately would make economic justification
difficult.

To develop a system of communications that would help the blind,
the deaf and blind, and the deaf would not only help the greater number
of handicapped but would also make the solution more economical. We agree
that the spoken word is one easy way out of this dilemma; we question
whether it is the best way.

THE  STE-RE  SYSTEM 

The  more  we  study  this  problem,  the  more  we  come  to  believe  that  a 
common  element  exists  in  perceptual  judgment  throughout  the  nervous 
system.  Our  ability  to  interpret  meaning  and  ideas  from  various  stimuli 
is  much  the  same  whether  by  sight,  touch,  or  sound.  The  significant 
difference seems to be in how complex a stimulus can be associated
with a single thought. The faculty of vision, which enables us to scan
groups of words, can associate a complex group with a single thought; touch
and sound are more limited. There are many advantages and disadvantages
to both the tactile and aural methods of communication. We have chosen
to use tactile stimuli in our work, however, for the following reasons: we
believe that it is potentially faster, that it will serve the greater number
of handicapped persons, that it is more versatile, and that it is easier to
learn. We have thus incorporated the following factors in our approach:

1) Tactile stimuli output

   a) An opened six-point expandable code

   b) Output stimulation to any cutaneous tissues sensitive to excitation

   c) Variable stimuli intensities

2) Facilities for writing as well as reading

   a) Manual input operable by one hand

   b) Provision for manual or automatic writing

3) Portability

4) Economical operation

5) Adaptability to other systems and developments

6) A 63 code, expandable at need

7) Elimination of searching and physical movement on the part of
   the reader.



We  have  decided  that  it  would  be  valueless  in  this  context  to  describe 
our  experiments  in  electromechanics. 

Physically, Ste-Re is composed of three fundamental units: the control
panel, the electromechanical encoding or decoding units, and the
nerve response unit.

The control panel shown (see Figure 1) is an isolated unit approximately
2 inches by 6 inches by 8 inches and weighs 15 to 20 ounces. It
consists of five buttons (one for each finger) and a treadle bar, and it
can be used to “write” 63 different characters.

To the student or writer each button is represented by a number (1
through 5); the treadle bar is Number 6. Depressing any one or a combination
of buttons simultaneously activates the electronic coding system and
“writes” the code representing the letter or character desired; for example,
the simultaneous depression of buttons 1, 3, and 4 creates the electronic
code for the letter “M.” The Ste-Re code numbering system, as related
to the operator or reader, is with only two or three exceptions the same
as the braille numbering system.

Ste-Re coding as related to the electronics of the system is quite different.
The buttons are numbered by a binary count, from 1 through 2, 4,
8, 16, and 32 (the treadle bar). Each character is represented by a single
number between 1 and 63. Thus, the letter “M” (buttons 1, 3, and 4) is
1 + 4 + 8 = 13, or the binary number 101100 when the digits are written in
button order, button 1 first. The five buttons and treadle bar are coupled
so that if the hand is resting on the treadle bar only binary numbers
between 1 and 31 will respond. If the treadle bar is released, only numbers
between 33 and 63 will respond. The purpose of this is to eliminate the heel
or treadle bar pumping.
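
The relation between a chord of controls and its electronic number can be sketched as follows. The weights are those given in the text; the function names are illustrative, and the coupling between the resting hand and the treadle bar is not modeled.

```python
# Electronic weights of the five buttons and the treadle bar (control 6),
# as given in the text: 1, 2, 4, 8, 16, and 32.
WEIGHTS = {1: 1, 2: 2, 3: 4, 4: 8, 5: 16, 6: 32}


def chord_value(controls: set[int]) -> int:
    """Sum the binary weights of the controls that make up the chord."""
    return sum(WEIGHTS[c] for c in controls)


def chord_pattern(controls: set[int]) -> str:
    """Write the chord in control order 1 through 6, one digit per control."""
    return "".join("1" if c in controls else "0" for c in range(1, 7))


letter_m = {1, 3, 4}
print(chord_value(letter_m), chord_pattern(letter_m))  # 13 101100
print(2 ** len(WEIGHTS) - 1)  # 63 possible non-empty chords
```

Six two-state controls give 64 combinations; leaving out the empty chord yields the 63 characters mentioned above.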

All other letters of the alphabet and all of the principal punctuation
marks are represented by the binary number for numbers between 1 and
31. Complying with the numbering system of braille coding has necessitated
exceptions to this rule four times.

The  user  is  seldom  required  to  support  the  weight  of  his  hand  while 
writing  as  is  required  with  a  typewriter  or  while  using  a  braillewriter. 

Figure 2 also shows an expanded control panel with several additional
controls. It has six buttons and a treadle bar. Moving the thumb to this
sixth button and using the panel in the described way provides 31 additional
codes which may be used to activate mechanical action of associated
equipment: typewriters, adding machines, etc. This sixth button is placed
on the face of the control panel because we believe it will be frequently
used by students or stenographers. The three position dial at the side
provides three additional code systems which could be used for the signs
and symbols of mathematics, medicine, etc.

Figure 2 Expanded Control Panel With Additional Controls

The  expanded  panel  will  be  slightly  larger  and  heavier  than  the  basic 
unit.  All  control  panels  are  electromechanically  the  same.  The  expanded 
unit  is  just  electronically  ‘more  of  the  same,’  and  can  be  expanded  further 
if  required. 



Figure  3  shows  a  simplified  block  diagram  of  the  encoding  unit.  The 
control  panel  pushbuttons  1  through  6  set  the  counters.  The  counters  can 
be  activated  by  other  means  than  the  control  panel:  typewriters,  magnetic 
and  punched  tape,  typesetter  tape,  and  so  on.  The  pulse  generator  (40) 
generates  negative  pulses  at  approximately  2500  cps.  The  pulses  are  fed 
to  “and”  gates  41,  42,  and  43.  The  pulses  are  blocked  at  these  gates  until 
a  control  signal  is  applied  to  terminal  “B”  of  the  gates. 

When the buttons are depressed, the Number 30 series contacts “make”
and cause the timed monostable multivibrator (44) to change from reset
to set for several cycles; while set it sends a control signal to gate 43,
which opens for a few cycles to allow pulses to pass to the bistable
multivibrators 46 and 47. These change from reset to set simultaneously. The
output that results from bistable multivibrator 46 changing state causes
gate 41 to open, and the output that results from bistable multivibrator 47
changing state causes gate 42 to open. Since the action is synchronized, gates 41
and 42 open simultaneously. Gate 41 allows pulses to pass directly to the
recording head, while gate 42 allows the pulses to pass directly to the
binary counters 51 through 56. The counters count the number of pulses
as they are recorded. The counters are bistable multivibrators and are so
connected that counter 51 changes state on every pulse, counter 52 on
every two pulses, counter 53 on every four pulses, and so on. The
set and reset outputs of each counter, 51 through 56, are connected through
the contacts of pushbuttons 1 through 6 to gate 80.
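
The counting behavior of counters 51 through 56 can be imitated with a chain of two-state stages. This is only a sketch of a generic binary ripple counter under the assumption that each stage advances the next when it returns to reset; the pushbutton presetting and gate 80 are not modeled, and the function name is illustrative.

```python
def ripple_count(pulses: int, stages: int = 6) -> list[int]:
    """Return the set (1) or reset (0) state of each stage, first stage first."""
    states = [0] * stages
    for _ in range(pulses):
        for i in range(stages):
            states[i] ^= 1      # this stage changes state
            if states[i] == 1:  # it went to set, so no carry to later stages
                break
    return states


print(ripple_count(13))  # [1, 0, 1, 1, 0, 0]
print(ripple_count(64))  # [0, 0, 0, 0, 0, 0]
```

After 13 pulses the stages stand in the same pattern as the chord for “M,” and after 64 pulses every stage is back at reset.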

To  summarize  this  action:  the  pushbuttons  initiate  the  counting  cycle 
and  also  preset  the  counters  to  a  given  number  of  pulses;  when  the  counters 
have  counted  the  preset  number  of  pulses,  gate  80  sends  a  signal  to  the 
bistable  multivibrator  46  and  causes  it  to  go  from  set  to  reset;  the  output 
of  46  closes  gate  41  and  in  turn  stops  pulses  to  the  recording  head. 
Hence,  the  preselected  number  of  pulses  representing  a  given  character  are 
recorded. 

Figure 3 Simplified Block Diagram of the Encoding Unit

Although pulses have been stopped to the recorder, the counter continues
counting until 63 pulses have passed. On the 64th pulse, counter 56
sends a signal simultaneously to the bistable multivibrator 47 and the
monostable multivibrator 81. The output of 47 is fed to gate 42 and closes
the gate to pulses entering the counter, while multivibrator 81 sends a
positive pulse to the recording head. This positive pulse is called the
“readout” pulse. Since multivibrator 81 is a timed monostable multivibrator,
it returns to reset automatically. At the 64th pulse all equipment
comes to reset, awaiting the next character. The time to write one character
is approximately .026 of a second, based on a 2500 cps pulse generator.
The recording head “writes” onto a memory disc.
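
The recorded signal for one character, and the time it occupies, can be sketched as below. The representation of the recording as a list of 64 slots is an assumption made for illustration, as are the function names; only the pulse rate and the 64-pulse cycle come from the text.

```python
PULSE_RATE_CPS = 2500       # pulse generator rate given in the text
PULSES_PER_CHARACTER = 64   # all equipment returns to reset on the 64th pulse


def encode_character(value: int) -> list[int]:
    """Return the 64 slots recorded for one character.

    -1 marks a recorded negative pulse, 0 an empty slot (gate 41 closed),
    and +1 the positive readout pulse.
    """
    if not 1 <= value <= 63:
        raise ValueError("character value must lie between 1 and 63")
    slots = [-1] * value                               # gate 41 open
    slots += [0] * (PULSES_PER_CHARACTER - 1 - value)  # counters still running
    slots.append(1)                                    # readout pulse
    return slots


def character_time_seconds() -> float:
    """Time taken by one full 64-pulse cycle."""
    return PULSES_PER_CHARACTER / PULSE_RATE_CPS


print(encode_character(13).count(-1))  # 13
print(character_time_seconds())        # 0.0256
```

Every character occupies the same 64-pulse interval regardless of its value, which is why a single writing time of about .026 second applies to all 63 codes.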


Memory  Disc 

The  term  memory  disc  is  chosen  to  differentiate  it  from  regular  records 
carrying  speech  or  music.  All  records  are  memory  discs;  the  Ste-Re 
memory  disc  is  therefore  nothing  more  than  a  record. 

Based  on  an  8 14 -inch  microgroove  record  (220  grooves  per  inch),  the 
total  number  of  words  possible  on  one  side  varies  from  400  to  10,000; 
producing  discs  with  400  words  per  side  permits  reading  2  to  10  words  per 
minute  for  training;  producing  discs  with  10,000  words  per  side  permits 
reading  50  to  250  words  per  minute  for  trained  readers.  These  discs  may 
be  mass  produced  from  expensive  or  inexpensive  materials,  depending  on 
the  purpose  or  need,  by  stamping  just  the  same  as  with  any  other  record. 
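
The two capacities quoted above imply the same playing time per side, as a short calculation shows. The function name is illustrative; the word counts and reading rates are those given in the text.

```python
def playing_minutes(words_per_side: int, words_per_minute: float) -> float:
    """Time needed to read one side at a steady rate."""
    return words_per_side / words_per_minute


# (words per side, slowest rate, fastest rate) for training and trained discs
for words, slow, fast in [(400, 2, 10), (10_000, 50, 250)]:
    print(words, playing_minutes(words, fast), playing_minutes(words, slow))
```

Both the 400-word training disc and the 10,000-word disc run from 40 minutes at the fastest setting to 200 minutes at the slowest.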

There  are  several  advantages  over  talking  books  in  creating  the  master 
and  in  mass  production  that  affect  both  quality  and  economy.  The  most 
significant  advantage  economically  is  shown  best  by  Figure  4. 

Figure 4 Comparison of Space Requirements: Characters Per Square Inch

The actual number of words per side of a Ste-Re disc is optional and
should be determined by practicality and efficiency. As an example, utilization
of two modified waveforms and stereophonic cutting on a 10-inch
record will provide 32,000 words without changing the linear speed under
the stylus or increasing the speed of the pulse generator (both of which
are possible). An increase to 64,000 words (i.e., both sides) would not
increase the mass production costs, and would increase the cost of the
master only slightly. Fidelity and sonic quality are unimportant with Ste-Re;
neither sound studios nor performers are necessary. The master can be
made simply by a typist, by a Ste-Re operator under almost any conditions,
from a typesetter tape (at virtually no cost), or in many other ways.

The wave forms are constant and equal in amplitude and frequency;
this, added to the mechanical design of the reader, makes erratic behavior
by the needle almost impossible. This fact makes inexpensive record material
such as Soundscriber discs very practical. The cost of the disc for
personal use, such as for notes or correspondence, would be about $.05
to $.07 per 2000 words, depending on writing speed.

Taking all factors into consideration it is probable that in moderate
quantities it will be cheaper to make a Ste-Re book from a typesetter tape
or by a typist than to make a printed periodical. A Ste-Re version of the
Reader's Digest could be made for the cost of mailing a braille edition to
some areas. (Despite the “no charge” agreement with the Federal government,
postage is not free.)

Figure  5  shows  a  complete  writing  unit  with  integral  controls.  This 
case  also  houses  all  the  encoding  electronics.  The  manual  and  mechanical 
operations  of  the  writer  are  the  same  as  for  the  manual  and  mechanical 
operations  of  the  reader  and  both  will  be  explained  at  the  same  time. 
To  operate  the  writer,  one  places  a  memory  disc  in  the  slot  shown.  At 
this  point  the  forward  and  reverse  buttons  can  be  activated.  Move  the 
load-run  lever  to  the  “run”  position  and  begin  to  write. 

When  one  has  finished  writing,  one  pushes  the  Stop  button,  turns  the 
lever  to  load-index  position  and  removes  the  disc.  If  one  had  written  a 
composition  for  classroom  use,  and  had  a  typewriter  coupled  to  the  Ste-Re 
writer,  a  typed  original  copy  and  a  Ste-Re  disc  (readable  and  editable  by 
you  who  are  blind)  would  be  made  at  the  same  time.  Perhaps,  having 
finished,  you  find  that  you  need  an  extra  typed  copy.  Place  the  completed 
disc  into  a  reading  machine,  couple  your  typewriter  to  this  machine,  and 
turn  it  on;  automatically  the  typewriter  will  make  another  copy.  Since 
the  mechanical  device  is  variable  in  speed  the  copy  could  be  made  at  rates 
up  to  250  words  per  minute. 

To use the reading machine the control procedures are exactly the
same. One would probably make more use of the available controls, particularly
of the speed control, which is operable while the disc is in
motion. If one wishes to reread several lines, move the lever to load-index
position, reverse it for several seconds, move the control to “run,” and
push the forward button. To locate a particular page, say page 714, place
one finger of the left hand on the seventh indexing mark and push the
forward (or reverse) button until the moving locator is opposite the
seventh mark. Release the forward (or reverse) button and turn the
manual crank 3½ turns. Move the lever to the operating position, select
the speed wanted, push the forward button (the reverse button will not
work in the operating position), and begin to write. This entire operation
can be carried out in less than a minute.

Figure 5 Complete Writing Unit with Integral Controls

The primary purpose of the mechanical operations of the Ste-Re
System is to drive the memory disc at a constant linear speed under the
pickup head, or at a constantly increasing angular speed with respect to
the position of the pickup head. The direct result of this is to physically
store more information in a given circular space. The constant linear speed
under the pickup head is accomplished by a constant rotary speed drive
wheel which maintains a constant position relative to the pickup head.
The method for accomplishing this motion is shown in Figures 6 and 7.

Figure 7 Mechanical Operation of the Memory Disc System II

Friction wheels 480 and 489 are driven at constant speed. A drive
wheel (489) rotates against and turns the friction driven disc (491) and
main drive shaft (492). The main drive shaft, through the pressure head
(493), rotates the turntable (460) and memory disc (466) at the same
speed as the friction driven disc. The bevel gear (495), which is attached
to the main drive shaft and friction drive disc, rotates the bevel gear (496),
threaded shaft (497), and gear train (498), (499), and (500), which
turns the threaded shaft (501). Threaded shafts 497 and 501 act as screw
feeds for the rotary drive wheel and the pickup head and maintain a constant
relation between these two components. With the rotary speed of the
drive wheel constant, the speeds of the turntable driving action and the
screw-feeding action are determined by the location of the wheel relative
to the center of the friction driven disc; the closer to the center of the disc,
the faster are the associated actions. This will provide constant linear speed
relative to the memory disc pickup head.
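
The effect of the friction drive can be checked numerically. The sketch below assumes that the drive wheel rolls without slipping and that the screw feeds keep the pickup head at the same distance from the center as the drive wheel; the rim speed and radii are hypothetical values, not dimensions of the machine.

```python
import math

RIM_SPEED = 10.0  # inches per second at the rim of the drive wheel (hypothetical)


def disc_rev_per_min(wheel_radius: float) -> float:
    """Rotation rate of the friction driven disc with the wheel at this radius."""
    return RIM_SPEED / (2 * math.pi * wheel_radius) * 60


def speed_under_head(wheel_radius: float, head_radius: float) -> float:
    """Linear speed of the memory disc surface passing under the pickup head."""
    radians_per_second = RIM_SPEED / wheel_radius
    return radians_per_second * head_radius


for radius in (4.0, 3.0, 2.0, 1.0):  # inches from the center, moving inward
    print(radius,
          round(disc_rev_per_min(radius), 1),
          round(speed_under_head(radius, radius), 3))
```

As the wheel and head move toward the center the disc turns faster, while the speed of the groove under the head stays at the rim speed of the drive wheel.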

All  actions  so  far  described  are  dependent  solely  on  the  speed  of  the 
driven  wheel  (480).  If  desired  these  relative  actions  may  be  accelerated 
or  decelerated  by  changing  the  speed  of  this  wheel,  which  obtains  its 
motion  from  a  drive  motor  through  disc  476.  The  motor  drives  disc  476  at 
a  constant  speed.  This  disc  rotates  the  driven  wheel  (480),  which  in 
turn  rotates  drive  wheel  489.  The  rotating  speed  of  drive  wheel  489  may 
be  varied  by  changing  the  position  of  the  driven  wheel  480  relative  to  the 
center  of  drive  disc  476:  the  closer  to  the  center  of  the  drive  disc  the 
slower  the  relative  speed.  Controls  are  provided  to  alter  the  disc  wheel 
relationship  by  a  manually  operated  screw  feed  for  driven  wheel  480. 

A  means  has  also  been  provided  for  high  speed  indexing  of  the  pickup 
head  and  drive  wheel.  If  the  turntable  and  memory  disc  are  lowered  the 
following  will  occur: 

1)  The  main  drive  shaft  and  pressure  head  will  lower  to  preset 
position. 

2)  The  pickup  head  and  memory  disc  will  separate. 

3)  The  drive  wheel  and  driven  disc  will  separate. 

4)  Bevel  gears  495  and  496  will  disengage. 

5)  Bevel  gear  507  will  engage  with  both  bevel  gears  (496  and  508). 

This  last  action  will  create  a  high  speed  drive  for  threaded  shafts  497 
and  501,  which  will  give  a  fast  indexing  action.  The  turntable  and  friction 
driven  disc  will  be  inactive  since  they  have  been  separated  from  their  source 
of  motion.  A  manual  fine  indexing  is  provided  through  the  hand  crank 
(512  and  513)  to  gear  499. 

An indicator (405) which operates in conjunction with gear 499 will
give the operator both visual and tactile indications of the location of the
pickup head and drive wheel.

The drive motor is interlocked with the mechanical operations to accomplish
the following:

1) The motor will run forward only if the turntable is in the normal
“read” position.

2) The motor is totally inactive while raising or lowering the turntable.

3) The motor will run both forward and backward only if the turntable
is in the “load” and “index” position in which the pickup
head is not in contact with the memory disc and the drive wheel
is not in contact with the driven disc.

Positive  positioning  of  the  memory  disc  is  provided  by  the  positioning 
lugs  (467  and  468). 

The  only  controls  required  for  operation  are: 

1) A load-run lever (402) to raise and lower the turntable

2)  A  speed  control  knob  (403)  to  vary  the  readout  speed 

3)  Three  pushbuttons  to  stop,  start,  and  reverse  the  drive  motor 

4)  A  crank  (513  and  512)  to  fine  index  manually  the  drive  wheel 
and  pickup  head. 

Figure  8  shows  the  block  diagram  of  the  decoding  unit.  The  readout 
head  (90)  changes  the  recorded  pulses  into  voltage  pulses.  The  negative 
pulses  are  amplified  (92)  and  fed  directly  to  counters  101  through  106. 
The  counters  will  count  the  negative  pulses  that  represent  the  character. 


Figure  8  Block  Diagram  of  the  Decoding  Unit 



The  counter  outputs  are  fed  into  “and”  gates  111  through  116  in  the 
same  combination  as  they  were  encoded. 

The positive readout pulse following the negative pulses is fed to the
amplifier-inverter (93) which in turn triggers the timed monostable
multivibrator (94) from reset to set for a short interval. The output of the
multivibrator is fed to “and” gates 111 through 116. At this instant every
gate that has a counter input will open and cause the nerve response solenoid
to be energized for a given time interval. At the end of this time interval
multivibrator 94 returns to its normal state and sends a reset signal to all
counters. The decoder is ready for the next character.
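
The count, strobe, and reset cycle described above can be sketched in software. This is a minimal illustration and not the Ste-Re circuit: it assumes, purely for the example, that the six counter stages hold a binary count and that stage k feeds gate 111 + k. The actual pairing of counters and gates follows the encoding scheme, which this sketch does not reproduce.

```python
def decode(pulses):
    """Decode a pulse train into lists of opened gate numbers.

    pulses: sequence of numbers; negative values stand for the recorded
    negative pulses, positive values for the readout pulse that follows
    each character.
    """
    characters = []
    count = 0
    for pulse in pulses:
        if pulse < 0:
            count += 1  # counters advance on each negative pulse
        elif pulse > 0:
            # The strobe from the multivibrator opens every gate that has a
            # counter input (assumed here: binary stages, bit k -> gate 111+k).
            gates = [111 + bit for bit in range(6) if (count >> bit) & 1]
            characters.append(gates)
            count = 0  # reset signal at the end of the timed interval
    return characters


if __name__ == "__main__":
    # Three negative pulses, strobe, one negative pulse, strobe.
    print(decode([-1, -1, -1, +1, -1, +1]))
```

Each inner list names the gates, and hence the solenoids, that would be energized for one character under the assumed mapping.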

Nerve  Response  Unit 

Whereas  the  nerve  response  unit  shown  in  Figure  9  is  a  left-hand  unit,  a 
nerve  response  unit  can  be  used  on  any  part  of  the  body  where  the 
cutaneous  tissues  are  sensitive  to  stimulation. 


The left-hand unit makes it possible to read with the left hand and
to write with the right hand. Examples of this use would be taking notes
in class while reading a book, or editing one’s own paper. An arm or leg
unit could be used by amputees and arthritics, and by musicians, who could
then play and read music at the same time.

A response unit is just the reverse of a writing unit. Instead of pushing
any one or a combination of buttons to write a character, any one or a
combination of pins make contact with the five fingers and heel of the
hand, or six other points on another part of the body. The impact of these
pins is controllable, from heavy or forceful for old or heavily calloused
hands to as light as required.

STE-RE  TRAINING  PROGRAM 

During February of last year a five-week experimental program was
started at Fort Bragg, North Carolina, with approximately 20 enlisted
volunteers of various mental levels participating. The purpose of this
experimental training was to determine the best physical design for the
control panel and response unit and to develop a method to teach Ste-Re
to blind persons.

As a result of this experiment the equipment was modified and changed
extensively. The code was modified and we established one acceptable
method of teaching Ste-Re. Despite these modifications and related
problems, along with prevalent absenteeism, the volunteers learned the code
quite easily and were writing at an average rate of 8 to 10 words per
minute and reading at 12 words or more per minute.

In  April  we  began  training  seven  blind  persons.  Training  averaged  20 
minutes  per  day  reading  and  20  minutes  per  day  writing.  No  practice 
between  classes  was  permitted.  All  of  these  persons  worked  at  the  Goodwill 
Center,  caning  chairs,  stuffing  mattresses,  and  doing  other  manual  work. 

In summarizing the training of these people it is important to establish
certain facts. Blind persons were chosen who had difficulty learning and
using braille. Only two of this group used braille. None read regularly or
subscribed to any braille books or magazines because of their limited
education and their calloused hands. These students could not spell very well,
nor were they familiar with any but the simplest words spelled to them.
One of the students, a deaf-mute with limited tunnel vision, despite his
schooling did not quite realize that words are linked together to make
sentences. Ste-Re does not improve one’s literacy, except that any mental
exercise tends to improve one’s memory.

Up to the point where the students were able to read and write the
alphabet their progress was easy to evaluate; beyond this point it was
difficult. All the timing has been done with two- and three-word groups, or
by giving a number of letters of the alphabet rapidly, instead of by timed
sustained reading and writing.

The  following  is  a  brief  description  of  the  method  used  in  training. 
The  students  were  divided  into  teams  of  two  persons  each.  Each  team 
had a control panel and a nerve response unit. The members of each team
were changed with each session. (This applied to all but the partially blind
deaf-mute.)  The  students  were  told  the  relationship  between  the  Ste-Re 
numbering  system  and  the  braille  numbering  system.  The  alphabet  and 
punctuation  marks  were  taught  in  order  of  frequency  of  use,  in  sections 
of  six  and  seven  characters;  each  section  was  followed  by  words  made  up 
of  the  characters  that  had  been  learned.  The  student  with  the  control  or 
writing  panel  was  shown  how  to  write  a  character;  he  in  turn  taught  his 
partner  how  to  read  the  character.  After  three  or  four  characters  were 
learned,  they  reversed  panels  and  the  person  who  wrote  learned  to  read. 
This  method  was  followed  throughout  the  training  period. 

Very little instruction time was spent with each student. Actually we
had a difficult time keeping them from learning too many characters at a
time: one of the students associated the braille numbering system with
Ste-Re, translated from braille into Ste-Re, and taught his partners ahead
of schedule.

All students were tested for potential acuity: the per-second rate of
tactile stimuli that can be reflectively isolated and recognized as
being different, minus all memory association to meaning or identification.

Figure  10  shows  the  equipment  for  this  experiment.  The  variable  speed 
drum  rotates  at  a  known  number  of  revolutions  per  minute;  it  sends  three 
codes  per  revolution  to  a  nerve  response  unit.  By  variable  switching  it 
sends  one  code  that  is  different,  then  goes  back  to  the  original  code.  The 
student  was  asked  to  signal  when  he  recognized  that  the  code  had  changed. 
This  test  determined  his  ability  to  “feel”  differences,  and  was  continued 
until the subject no longer knew when or if the code changed. This was
at the threshold of blend. Potential acuity increases with training, but at
any  given  time  no  one  can  read  faster  than  his  potential  acuity.  With 
trained  people  the  difference  between  actual  acuity  and  potential  acuity 
should  be  very  small.  If  it  is  large  the  reader  is  incapable  of  reading  faster 
because  he  is  unable  to  memorize  and  associate  what  he  feels  reflectively 
with  a  meaning. 

The  difference  between  actual  acuity  and  effective  acuity  represents  lost 
time.  Five  blind  persons  were  first  tested  at  5.0  stimuli  per  second,  and 
were increased to 11.0 stimuli per second. Our test equipment generated
a maximum of 11 codes per second. Only one of the five subjects reached the
threshold of blend; the remaining four subjects have a potential Ste-Re
acuity of at least 11.0. An acuity of greater than 10 was not anticipated in
this early stage of training; therefore our equipment was not designed and
built for greater speeds. Since these tests were completed, time has not
been available for further modification.
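
Because the drum sends three codes per revolution, the stimulus rate follows directly from the drum speed. The short Python sketch below works that arithmetic; the function names are hypothetical, and the drum speeds it reports are derived from the stated rates, not measured values from the experiment.

```python
CODES_PER_REVOLUTION = 3  # stated in the text for the test drum


def stimuli_per_second(drum_rpm, codes_per_rev=CODES_PER_REVOLUTION):
    """Stimulus rate delivered to the nerve response unit."""
    return drum_rpm * codes_per_rev / 60.0


def rpm_for_rate(stimuli_rate, codes_per_rev=CODES_PER_REVOLUTION):
    """Drum speed needed to deliver a given stimulus rate."""
    return stimuli_rate * 60.0 / codes_per_rev


if __name__ == "__main__":
    for rate in (5.0, 11.0):
        print(rate, "stimuli/sec needs", rpm_for_rate(rate), "rpm")
```

On these assumptions the starting rate of 5.0 stimuli per second corresponds to 100 revolutions per minute, and the 11.0 limit to 220.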

Table 2 shows the results of the training program. The average training
time for reading was 6.25 hours; the average training time for writing
was 6.25 hours. The average words per minute attained was 10.6 in
writing and 19.2 in reading. The alphabet was learned after 7 hours total
training time (3½ hours reading and 3½ hours writing). The deaf-mute
with tunnel vision is not included in these averages; however, he responded
very well to the training, for he learned and can use the entire alphabet.


TABLE 2

RESULTS OF STE-RE TRAINING PROGRAM

                   Training Time (Hours)     Acuity     Rates Attained (WPM)
Name        Age   Reading  Writing  Total   Stim/sec    Writing    Reading

Clyburn      42     6.00     6.00   12.00     11+           8         18
Funderburk   25     5.00     5.00   10.00     Not used for these tests
Lowery       27     6.67     6.67   13.33     11+          15         24
Mathews      60     7.33     7.33   14.67      9            8*        15
Peck         42     5.00     5.00   10.00     11+           6         15
Shelton      58     7.67     7.67   15.33     11+          17         25

Average    42.3     6.25     6.25   12.50    10.6        10.6       19.2

* Includes one error while encoding

Figure 11 shows the observed learning pattern as related to acuity
development. Although each student will have his own pattern, the
patterns will have the same general curves. The potential acuity curve is a
relatively smooth constant development curve. This curve is the limit
of one’s ability to read at speed. This curve is plotted against the learning
curve for the Ste-Re System. In the first stage actual acuity was very close
to potential acuity. The stimuli were simple and the student found the few
codes learned were easily distinguishable one from another. In the second
stage the student was given several more codes and in his effort to translate
the different stimuli into meanings he developed a kinesthetic “feeling.”
This occurred when the student said, for example, “Those were fingers 2, 3,
and 5, or thumb and little finger.” Progress is slow. In the third stage more
codes were added. Here the kinesthetic feeling took over; we heard,
“Everything feels the same. I can’t even tell ‘E’ from ‘T’ anymore.” This
stage is very slow and almost retrogresses.


Figure 11  Potential Vs. Effective Ste-Re Acuity Curves


The fourth stage is the beginning of good progress. The student starts
to reflectively interpret meaning from stimuli. He no longer tries to
associate the stimuli with certain fingers. He feels, hears, and knows; he
loses the sense of awareness of kinesthetic feelings.

In the fifth stage the student should continue to improve rapidly and
begin to overlap; that is, he should be able to interpret the meaning of a
code before the real feeling of the last code has left. He begins to read
off the points of stimulation. At the upper limits of this stage he is
approaching the potential acuity curve.

The last stage should result in reading from peaks of stimulation only.
This is the threshold of rhythm blend. Substitution of the feel-know habit
for the feel-hear-know habit occurs during this stage.

DISCUSSION 

Taking all factors into consideration, we believe Ste-Re will be 30 to 35
percent faster than braille book reading. The average reader will be able
to read without the use of contractions at approximately the same speed
as a Grade 2 English braille reader. This paper is not concerned with the
problems of contractions. We realize that without contractions the braille
book and the practice of braille reading would not exist as they do today.
The introduction to the Braille Reference Book by M. S. Loomis (3)
starts out by explaining the shortcomings, pitfalls, and problems of
contractions. Some writers defend contractions, others apologize for
contractions; few expound on their virtues. Thus, the word “gathered” can be
spelled correctly three ways, but only one way is “correct.” Additional
contractions are added from time to time; yet it is difficult for this author
to believe Goethe contracted 40 percent is still Goethe, or that a scientific
or engineering journal contracted is comparable to the printed word. The
engineering of any automatic translating equipment, therefore, is made
very difficult and in some cases prohibitively expensive. Readers do not
read 40 percent faster with 40 percent contraction, because of confusion,
rules, etc., which also make learning braille very difficult, and which
require blind persons to learn many, many more rules than sighted readers.
(We send blind children to a regular sighted school and the child learns,
by entirely different rules, a book or sentence that only remotely resembles
the written word.) The teaching of Grade 2 English braille seriously
curtails the ability to spell. Contracting is a very expensive compromise.

This  is  not  said  defensively,  for  the  Ste-Re  System  can  be  contracted 
in  any  way — by  the  rules  of  Grade  2  braille  or  any  other  braille  system — 
just  as  any  code  language,  including  that  of  the  printed  word. 

Perhaps the opening paragraph of this paper should have stated a
principal aim or a basic purpose. Our purpose is to create a system which
will provide blind persons with the same versatility, adaptability, and
freedom of communication as is enjoyed by sighted persons. A satisfactory
system must adjust to the user; blind persons should not have to adjust to
the equipment. We did not try to imitate tools and methods for the sighted.

No system of communicating information can eliminate the need for
necessary labeling of cans, filing systems, and other objects; however
important this may be, it still represents only a small portion of the time
that one spends in reading. For this reason we believe that a simplified
noncontracted cell can still be used for this purpose so long as the
numbering system is common to the code used.

The  Ste-Re  System  is  versatile.  It  includes  a  group  of  devices  which 
provide  methods  for  writing  and  for  reading.  It  consists,  as  previously 
stated,  of  five  fundamental  units.  These  units  can  be  arranged  in  various 
ways.  The  control  panel  can  be  coupled  to  a  typewriter,  or  a  typewriter 
can be coupled to a control panel. The input can be from automatic
typesetting tape, wire, direct access readers, and other sources. Ste-Re can be
coupled  to  calculating  machines,  adding  machines,  and  other  output  devices. 
This  system  also  provides  a  means  of  making  automatically  as  many  copies 
of  a  letter  as  desired.  Ste-Re  also  can  be  transmitted  by  radio. 

Although a memory disc is believed to be superior, a magnetic tape,
punched tape, or wire can also be used as a recording medium. We are
of the opinion that these forms are cumbersome, difficult to use and index,
fragile, and uneconomical. The character density per square inch would be
very low, only equaling that of embossed braille cells.

Although  the  output  from  Ste-Re  to  the  reader  is  tactile,  the  electrical 
signal  can  be  used  to  ring  bells  or  blow  whistles,  or  can  be  coupled  to  any 
output  code  method.  The  tactile  output  of  Ste-Re  can  be  presented  to 
any  part  of  the  body,  freeing  the  hands  for  other  uses  such  as  reading  and 
playing  music  at  the  same  time.  It  provides  an  easy  way  to  index  and  can 
be  automatically  or  manually  fed;  it  can  be  read  at  any  chosen  speed. 

Ste-Re is portable. This, of course, is relative and was determined by
what today would be the most economical construction. By using
subminiature parts the encoder or the decoder can be built as small as a
package of cigarettes. With advances being made in electronic circuitry and
design, it is possible that the use of subminiature parts in a very few years
will not be as expensive as using miniature parts today. It can easily be
said that the maximum weight and dimensions would be less than those of
a portable typewriter. It must be remembered in considering the weight
to be carried that gross weight is the important factor. The abovementioned
weight would include memory discs which would contain the equivalent
of four to five heavy volumes of printed material, and include a
sufficient number of blank discs to enable the user to write over 4000
words. In other words, the gross weight for this reading and writing
material and equipment in many cases would not be as heavy as the printed
volumes carried by the sighted user.

Writing Ste-Re is very rapid and is comparable to typewriting.
Considering that it is a one-handed operation, a time and motion study would
indicate that it would be faster than typing. The potential reading speed
is very fast, and of course will vary greatly with the capabilities of the reader.

The economy in using the Ste-Re System is a little difficult to analyze.
We estimate that in mass-produced quantities the reader/writer cost will
be between $300 and $500. An inexpensive disc for school work or
correspondence will cost about $.05 and contain 4000 words relative to a
writing speed of 50 words per minute. Discs of a little better quality, which
would be used for periodicals and other temporary literature, could be
produced for $.08 to $.10 for 20,000 words. Fidelity is of no importance;
therefore a master can be made by a typist, and in addition this cost can
be eliminated by coupling Ste-Re directly to a typesetting tape. Long-lasting
records for better quality books might cost $1 to $1.50 each. All of this
cost is based on 220 groove lines per inch and an 8-inch diameter disc
using waveforms generated at 2500 cps. The New World Book Encyclopedia
for the blind requires approximately 43 feet of shelf space. This in
Ste-Re, completely indexed, might require three feet. Providing special
braille books for students is very costly and requires the volunteer services
of thousands of persons who must, at a great cost, be trained thoroughly
in the use of the braillewriter. Anyone who can use a typewriter can write
Ste-Re; anyone can use the Ste-Re writer with two to three hours’ practice.
A school teacher can copy a child’s beginning reader into Ste-Re in a few
minutes. No teaching expense is necessary to teach Ste-Re; it can be
self-taught.
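
The estimates quoted above can be put on a common footing with a little arithmetic. The Python sketch below uses only the figures given in the text; the function names are hypothetical, and the results are rough comparisons under those estimates, not cost data.

```python
def cost_per_thousand_words(disc_cost_dollars, words_per_disc):
    """Disc cost expressed per 1000 recorded words."""
    return disc_cost_dollars / words_per_disc * 1000.0


def minutes_to_fill(words_per_disc, words_per_minute):
    """Writing time needed to fill one disc at a steady rate."""
    return words_per_disc / words_per_minute


if __name__ == "__main__":
    # School or correspondence disc: $.05 for 4000 words.
    print("school disc, $/1000 words:", cost_per_thousand_words(0.05, 4000))
    # Periodical disc: $.08 to $.10 for 20,000 words.
    print("periodical disc, $/1000 words:",
          cost_per_thousand_words(0.08, 20000), "to",
          cost_per_thousand_words(0.10, 20000))
    # 4000 words written at 50 words per minute.
    print("minutes to fill a 4000-word disc:", minutes_to_fill(4000, 50))
    # Encyclopedia shelf space: 43 feet in braille, about 3 feet in Ste-Re.
    print("shelf space ratio:", round(43 / 3, 1))
```

On these figures a 4000-word disc holds 80 minutes of writing at 50 words per minute, and the encyclopedia would occupy roughly one-fourteenth of its braille shelf space.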


APPENDIX* 

* The material in this section is summarized by the authors from the question and
answer period following the presentation of their paper.

The most frequently asked questions about Ste-Re are concerned with the
problems of reading speed.

Some persons believe strongly that tactile interpretation must be related to
shape, as opposed to location and number as are used by the Ste-Re System. We
believe that even though braille cell interpretation may be associated with
shape primarily, this does not preclude the possibility that the association by
location and number can be as rapid. The interpretation of braille cells
containing only one dot cannot be by shape; it must be by number and location.
Braille cells containing two dots add only one more factor to the number and
location association, and that is direction.

In the Ste-Re System the hand response unit illustrated and described in
the paper would have to be interpreted by stimuli location and number. An
arm unit or other body unit would also provide a pattern. The fact that the
described hand unit is a six-point hand unit does not mean that a single active
braille cell could not be used for interpretation. We did not choose to use this
method as an illustration because of the generally accepted fact that to interpret
the braille cell requires finger movement across the dots. The tests and
experimentation that we conducted indicated that this is not required when
separated areas, such as the hand unit described, are stimulated.

The  Ste-Re  System  and  method  of  stimulation  has  been  used  and  learned 
by  those  who,  for  one  reason  or  another,  were  not  able  to  use  the  braille 
system  and  in  actual  practice  these  persons  have  been  able  to  read  Ste-Re,  with 
only  a  few  hours  of  training,  at  the  rate  of  20  words  per  minute. 

The greatest amount of time that we devoted to experimentation and testing
was to this point. The experiments we made in acuity development and
perceptual judgment, as related to the Ste-Re code, indicated possible speeds to be
considerably higher than in using braille. The primary reason for this increased
speed is related to the lost time in braille reading due to necessary finger
movement and the physical effort required to read the braille code. The subjects
tested recognized differences in Ste-Re stimuli in excess of 10 per second.

We  do  not  believe  that  whether  the  code  is  associated  with  shape  or  location 
is  of  much  significance  to  the  trained  reader.  It  is  important  to  the  piano  student 
to  know  where  to  place  his  fingers;  the  accomplished  pianist  does  not  play 
keys:  he  plays  chords  or  notes. 

A second group of questions is related to what our next step will be if
we are permitted to continue with our work. The answer to this is that we
intend to determine more accurately and under more controlled conditions the
ability of persons to interpret the Ste-Re code. The first phase of this work would
be devoted entirely to acuity and identification tests. The purpose here would
be not just to confirm or refute the results of our previous tests, but to include
acuity tests as related to a single active braille cell and other means. If these
tests prove successful the next step would be to test and instruct a larger, more
select, and more controlled group than we have thus far been able to afford, in
the actual use of the Ste-Re reader and writer with various outputs, as dictated
by previous acuity experiments. We would also test and study the problems of
retention of Ste-Re as compared to braille.

A third question that we are often asked is as follows: “You mentioned in
your paper that the electromechanics could be built much smaller. What is the
possibility of this and when do you think it will be possible?” The answer to
this is that without changing any of the fundamental principles or methods of
Ste-Re the physical components can be greatly reduced in size, simplified, and
made more economical. This can be accomplished right now.

REFERENCES 

1.  Dresslar, F. B., “Studies in the Psychology of Touch,” Amer. J. Psychol.,
Vol. 6 (1894), pp. 313-368.

2.  Gibson, E. J., “Improvement in Perceptual Judgment as a Function of
Controlled Practice or Training,” Psychol. Bull., Vol. 50 (1953), p. 401.

3.  Loomis, M. S. Braille Reference Book for Grades 1, 1½, and 2. New York
and London: Harper & Bros., 1942.

4.  Solomons, L., “Discrimination in Cutaneous Sensations,” Psychol. Rev.,
Vol. 4 (1897), pp. 246-250.

5.  Tawney, G., “Ueber die Wahrnehmung zweier Punkte mittelst des Tastsinnes,
mit Rücksicht auf die Frage der Uebung,” Phil. Stud., Vol. 13 (1897), pp.
163-222.

6.  Titchener, E. B. Experimental Psychology, Vol. I. New York: Macmillan, 1901.

7.  Titchener, E. B. Experimental Psychology, Vol. II. New York: Macmillan, 1905.

8.  Volkmann, A. W., “Ueber Einfluss der Uebung,” Leipzig Berichte Math. Phys.
Classe, Vol. 10 (1858), pp. 38-69.


PROBLEMS IN GENERATING SPOKEN OUTPUTS

EDWARD  E.  DAVID,  JR. 

Bell  Telephone  Laboratories,  Inc.,  Murray  Hill,  New  Jersey 


A mechanized print-to-voice converter of the capabilities and dimensions of
a human reader would be an undisputed boon to the visually handicapped.
Such a development is presently well beyond us, and I am not convinced of
the utility of any device much more limited in performance or less
conveniently packaged. However, there is at least one recent research
accomplishment worthy of mention in connection with the print-to-voice goal,
namely the conversion of a symbolic input to speech. In this paper I will
contrast two approaches to print-to-voice conversion including this new
factor in its appropriate context.

I  will  assume  throughout  that  machine  readable  text  is  available  as 
input.  However,  I  can’t  resist  commenting  that  though  there  are  print 
readers  which  actually  work,  albeit  in  a  restricted  context,  they  tend  to  be 
ponderous and unreliable. Let us hope both for the purposes of this paper
and for machine retrieval and processing of literature that printers and
typewriters will in the near future routinely transcribe machine readable
tape along with visual copy.

The most direct means for voice readout uses prerecorded speech
segments stored usually as sound tracks on a drum or disc. As shown in Figure
1, each recording is suitably addressed so that it can be keyed out as
required. In the case pictured the recordings are words and each is addressed
by its English spelling so that an input from a machine readable equivalent
of the printed page can gain access to the correct recordings and yield a
direct conversion to sound.
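
The addressing scheme can be sketched as a table keyed by spelling. The Python below is illustrative only: the store contents, the track numbers, and the function name key_out are hypothetical, and a real store would hold sound tracks where this sketch holds integers.

```python
def key_out(text, store):
    """Look up each word of text in a store addressed by English spelling.

    Returns (tracks, missing): the track identifiers found, in order, and
    the words for which the store holds no recording.
    """
    tracks = []
    missing = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'")
        if not word:
            continue
        if word in store:
            tracks.append(store[word])
        else:
            missing.append(word)
    return tracks, missing


if __name__ == "__main__":
    store = {"the": 0, "time": 1, "is": 2}  # hypothetical track numbers
    print(key_out("The time is late.", store))
```

The list of missing words makes plain the limit of the method: any word absent from the store simply cannot be spoken.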

Figure 1  Block Diagram of Voice Readout Using Prerecorded Speech
Segments (prerecorded store: word codes, recordings)


Prerecorded words and phrases have been assembled into sentences for
simple announcements for many years. For instance, by dialing a telephone
number in certain cities, one can hear “The time is eight forty-two.” This
announcement is synthesized from four individual utterances: “The time
is,” “eight,” “forty,” and “two.” Even such simple utterances sound more
natural and agreeable if they carry the falling inflection typical of English
sentences. Thus the hour utterances (“eight” in this case) are taken from
a different set of recordings from the minute utterances (“two” in this
case). In more complex utterances intonation and stress play a more
important role; namely, they aid in communicating both the grammatical
structure (syntax) and the meaning. In fact, complicated sentences can be
comprehended only with difficulty unless accompanied by stress and
intonation which are either “neutral” or support the syntax and meaning.
Recent research indicates that utterances assembled from monotone
recordings are not satisfactory for comprehension (4). Apparently
monotone inflection is not “neutral” in its perceptual effects. Better results have
been obtained by recording words using an inflection associated with their
usual grammatical usage (1).
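
The time announcement can serve as a small worked case of choosing among recording sets. The sketch below, in Python, tags each utterance with the set it would be drawn from; the labels “medial” and “final” are hypothetical names for the non-falling and falling inflection sets, and the restriction to minutes 20 through 59 is a simplification made only to keep the example short.

```python
HOURS = ["one", "two", "three", "four", "five", "six",
         "seven", "eight", "nine", "ten", "eleven", "twelve"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}
UNITS = ["", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]


def time_announcement(hour, minute):
    """Return (utterance, recording_set) pairs for a time of day.

    The last utterance is drawn from the "final" set, which carries the
    falling inflection; all earlier ones come from the "medial" set.
    """
    if not 1 <= hour <= 12 or not 20 <= minute <= 59:
        raise ValueError("sketch covers hours 1-12 and minutes 20-59 only")
    tens, units = divmod(minute, 10)
    words = ["The time is", HOURS[hour - 1], TENS[tens]]
    if units:
        words.append(UNITS[units])
    last = len(words) - 1
    return [(w, "final" if i == last else "medial")
            for i, w in enumerate(words)]


if __name__ == "__main__":
    for utterance, recording_set in time_announcement(8, 42):
        print(recording_set, "-", utterance)
```

Even in this tiny vocabulary a word such as “forty” must be stored twice, once for mid-sentence use and once for sentence-final use, which is the multiplication of stored versions discussed in the text.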

To  synthesize  semantically  or  syntactically  complex  utterances  from 
recorded  words  may  require  several  versions  of  most  words — versions 
differing  subjectively  in  intonation  and  stress  and  objectively  in  duration, 
pitch,  and  intensity.  This  requirement  multiplies  the  number  of  utterances 
to be stored. Just how many words and how many versions of each word
would be required for any particular class of texts is still a subject of
research. Note, too, that selection of one from several available versions of
a  word  involves  linguistic  analysis  of  unknown  complexity. 

There are these same uncertainties with prerecorded segments shorter
than words. Experiments with phoneme-length segments have not
produced acceptable speech. This result arises because the prerecording
technique makes no provision for transitions between segments, and natural
transitions are vital to speech intelligibility. This difficulty can be
circumvented by using segments composed of two phonemes, with the transition
in the middle, and the beginning and end of the segments at points which
are as nearly steady state as possible (3). Such segments have been called
“dyads,” and intelligible speech has been demonstrated by piecing them
together. Again, however, we don’t yet know just how many versions of
each dyad would be required for a realistic reading task. Certainly the size
of vocabulary and the complexity of syntax would be major determining
factors of this numerology.
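
Cutting an utterance into dyads amounts to taking overlapping pairs of adjacent phonemes. The Python sketch below shows the bookkeeping only; the boundary marker “#” for silence at either end and the phoneme spellings are assumptions made for illustration, and no acoustic splicing is attempted.

```python
def to_dyads(phonemes, boundary="#"):
    """Return the overlapping two-phoneme segments covering an utterance.

    Each dyad spans the transition from one phoneme to the next, so the
    joins fall in the (nearly) steady-state middle of each phoneme.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return list(zip(padded, padded[1:]))


def max_dyads(n_phonemes):
    """Upper bound on distinct dyads for an inventory of n phonemes,
    counting one version of each ordered pair and ignoring boundaries."""
    return n_phonemes * n_phonemes


if __name__ == "__main__":
    print(to_dyads(["k", "ae", "t"]))
    # An inventory of 40 phonemes is assumed here only as a round figure.
    print(max_dyads(40))
```

The quadratic growth of the inventory, before any allowance for several versions of each dyad, suggests why the storage question remains open.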

A clever and workable variant using prerecordings is so-called
“spelled speech.” Here words are spelled out using letter names: A, B,
C, . . . , etc. In addition to the difficulties of comprehension already
mentioned, spelled speech is limited as to speed.

An  alternative  to  the  prerecording  approach  uses  a  speech  synthesizer. 
There  are  several  different  kinds  of  synthesizers.  I  will  describe  a  particular 
one,  but  any  other  would  do — in  principle  at  least. 

It has long been known that vowel sounds can be made artificially by
using a buzz to excite a tube with multiple resonances, simulating the vocal
tract. Electrically this is equivalent to using a pulse generator to excite a
series of simple resonance circuits each tuned to a different frequency. A
three-resonance vowel synthesizer is shown in Figure 2. By setting the
resonant frequencies to particular values the various vowels can be
synthesized.

Figure 2 Vowel Synthesizer
[Tunable resonant circuits. Example settings: |i| 270, 2290, 3010 cps;
|ae| 660, 1720, 2410 cps.]
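The arrangement of Figure 2 can be imitated numerically. What follows is a minimal sketch under stated assumptions, not the circuit described above: each tuned circuit is replaced by a digital two-pole resonator, and the sampling rate (10,000 samples per second), buzz frequency (120 cps), and bandwidth (80 cps for every resonance) are illustrative values chosen here. Only the resonant frequencies for |i| and |ae| are taken from Figure 2; all function and variable names are hypothetical.

```python
import math

# Resonant frequencies in cps, as listed in Figure 2.
VOWEL_RESONANCES = {"i": (270, 2290, 3010), "ae": (660, 1720, 2410)}


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Two-pole digital resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    The gain is normalized to 1 at zero frequency (a + b + c = 1).
    """
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(
        2.0 * math.pi * freq_hz * t
    )
    a = 1.0 - b - c
    return a, b, c


def resonate(signal, freq_hz, bandwidth_hz, sample_rate):
    """Pass a list of samples through one resonator."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def buzz(pitch_hz, duration_s, sample_rate):
    """Pulse train standing in for the buzz: one unit pulse per pitch period."""
    n = int(round(duration_s * sample_rate))
    period = int(round(sample_rate / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesize_vowel(vowel, pitch_hz=120.0, duration_s=0.3,
                     sample_rate=10000, bandwidth_hz=80.0):
    """Excite three resonators in series with the buzz (assumed parameters)."""
    signal = buzz(pitch_hz, duration_s, sample_rate)
    for freq_hz in VOWEL_RESONANCES[vowel]:
        signal = resonate(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


samples = synthesize_vowel("ae")
```

Changing the vowel argument changes only the three resonant frequencies, which is the point made in the text: the same buzz and the same chain of resonances yield different vowels.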

Connected speech can be made by controlling dynamically the resonant
frequencies of a slightly elaborated synthesizer (Figure 3). The elaboration
consists of adding damping controls for the resonances, and a hiss or noise
source. These additions enable the machine to utter consonants such as /S/,
/Sh/, /m/, /n/, /p/, and /b/. The proper excitation is assured by controls
on the intensity of hiss and buzz driving the resonances. All told, nine
control signals are necessary. Note that the voice pitch is set by the buzz
frequency, while the intensity and duration of sounds are determined by the
detailed shape of the control signals. Thus pitch, duration, and intensity
can all be controlled independently.

Figure 3 Resonance Synthesizer
[Buzz and hiss sources drive the variable tuned circuits to produce
speech. Control inputs set the buzz frequency, the buzz and hiss
intensities, and the frequency and damping of each resonance.]

The notion of using such a machine for translating from printing to
speech is conceptually simple (Figure 4). The letters, which are assumed
to be in machine-readable form, provide the input to a translator which
contains the rules and data necessary for supplying the driving signals to
the synthesizer. Supplying the “guts and feathers” for this translator is the
formidable problem. It involves taking a discrete, symbolic input and
deriving a set of continuous outputs closely related to speech itself.


Figure 4 Translation From Printed Word to Speech
[The translator supplies control signals to the synthesizer.]


Two of my colleagues, J. L. Kelly, Jr. and L. J. Gerstman, have
provided us with the beginnings of a solution to this problem (2). They have
constructed a set of rules whereby synthesizer control signals can be
generated from an input of phonetic symbols each of which carries a pitch and
duration specification (Figure 5). Their rules assume that speech can be
synthesized from a sequence of constant control values connected into a
continuous flow by transitions. The central feature is a stored table which
for each input symbol contains a list specifying eight control values. These
represent the steady states or target values which will be achieved for each
input symbol. The pitch values, which accompany the input symbols, set the
steady frequency of the buzz source. Thereby, all nine controls are
determined at a temporal sequence of points. Two duration values, also
supplied with the input symbols, complete the specification. One value
specifies the duration of each steady state, while the second gives the
duration of the transition between steady states. The transitions themselves
are interpolated according to rule. Parabolic transitions are used between
vowel and consonant, while linear transitions suffice between vowel and
vowel or consonant and consonant. Thus continuous control signals are
automatically constructed for the synthesizer.
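The table-and-transition scheme just described can be sketched in a few lines of Python. This is an illustration under several assumptions, not the Kelly-Gerstman program itself: the target table holds only three symbols, its damping and intensity values are invented for the example, a "parabolic" transition is taken to mean a blend weight that grows as the square of elapsed time, the transition duration supplied with a symbol is taken to govern the transition into the next symbol, and pitch is interpolated along with the other eight controls. All names are hypothetical.

```python
# Symbols treated as vowels when choosing the transition shape.
VOWELS = {"i", "ae"}

# Eight target values per symbol (hypothetical layout and values):
# (F1, F2, F3, damping1, damping2, damping3, buzz_intensity, hiss_intensity)
# Only the vowel resonant frequencies come from Figure 2.
TARGETS = {
    "i": (270.0, 2290.0, 3010.0, 0.1, 0.1, 0.1, 1.0, 0.0),
    "ae": (660.0, 1720.0, 2410.0, 0.1, 0.1, 0.1, 1.0, 0.0),
    "s": (300.0, 1800.0, 2600.0, 0.5, 0.5, 0.5, 0.0, 1.0),
}


def blend_weight(u, parabolic):
    """Fraction of the way toward the next target at normalized time u."""
    return u * u if parabolic else u


def control_track(symbols, step_s=0.01):
    """Build frames of nine control values, one frame every step_s seconds.

    symbols: list of (symbol, pitch_cps, steady_s, transition_s) tuples.
    """
    frames = []
    for idx, (sym, pitch, steady_s, transition_s) in enumerate(symbols):
        # Eight table values plus the pitch give the nine controls.
        current = TARGETS[sym] + (pitch,)
        frames.extend([current] * int(round(steady_s / step_s)))
        if idx + 1 == len(symbols):
            break  # no transition after the final symbol
        next_sym, next_pitch = symbols[idx + 1][0], symbols[idx + 1][1]
        upcoming = TARGETS[next_sym] + (next_pitch,)
        # Parabolic between vowel and consonant, linear otherwise.
        parabolic = (sym in VOWELS) != (next_sym in VOWELS)
        n = int(round(transition_s / step_s))
        for k in range(1, n + 1):
            w = blend_weight(k / (n + 1), parabolic)
            frames.append(
                tuple(a + w * (b - a) for a, b in zip(current, upcoming))
            )
    return frames


track = control_track([
    ("s", 120.0, 0.05, 0.03),
    ("ae", 120.0, 0.10, 0.02),
    ("i", 130.0, 0.10, 0.0),
])
```

Each frame carries all nine controls, so the discrete input (three symbols with their pitch and duration values) has become a continuous set of driving signals sampled every hundredth of a second.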

Speech from such a system is reasonably articulate. Its intelligibility
could be improved if more effects of context were built in. We know that
in natural speech the articulation of a sound is profoundly affected by its
neighbors. This effect would require that the steady state table values
depend not only on the immediate input symbol, but also upon the ones
preceding and following.

As  to  quality,  machine  speech  sounds  reasonably  natural  provided 
that  the  pitch  and  duration  values  are  those  commonly  encountered  for  the 
particular utterance. For instance, the score of a song provides a
ready-made input, and the machine is quite adept at singing.


Bell  Telephone  Laboratories 
Incorporated 


Figure 5 Control of the Synthesizer
[Phonetic characters, accompanied by durations and pitch, enter the
control unit, which drives the resonance synthesizer to produce speech.
Parabolic and linear transitions are illustrated.]



In terms of the concept of a print-to-voice converter using a speech
synthesizer, there are a number of functions yet to be worked out. Figure
6 is the same conceptual drawing as Figure 4 except the
letter-to-control-signal translator has been broken down to accommodate
the Kelly-Gerstman machine, which is shown in solid lines. The remainder
of the machine is dotted to indicate that it doesn’t yet exist.

Figure 6 Translation from Printed Word to Speech Utilizing
Kelly-Gerstman Machine

One unsolved problem is the translation from letters to phonetic symbols.
A possibility here is to derive the phonetic equivalent of a word by
algorithm from its printed version. So far as I know, no one has yet
attempted to formulate such a set of rules. A less speculative proposal is
a dictionary in which each input word has an entry specifying its phonetic
equivalent in symbols. Since its entries would be in essentially alphabetic
form, such a dictionary might be quite modest in its storage requirements
compared to one storing prerecorded words or even synthesizer control
signals. This concept, however, is not as straightforward as it seems on
the surface, for the phonetic equivalent of a word, too, depends upon its
context. Perhaps these interactions could be accommodated by a set of
rules to modify the symbolic output, particularly at word boundaries,
before it is applied to the Kelly-Gerstman translator. However, this
aspect of the problem has been too little researched for hard and fast
conclusions at present.
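The dictionary proposal, together with a rule applied at word boundaries, might look like the following Python sketch. It is offered only as an illustration of the idea: the four-word dictionary, the ad hoc phonetic symbols, and the single boundary rule (the word "the" takes a different vowel before a word beginning with a vowel sound) are all assumptions made here, and every name is hypothetical.

```python
# Hypothetical dictionary: printed word -> list of phonetic symbols.
PHONETIC_DICTIONARY = {
    "the": ["dh", "ax"],
    "cat": ["k", "ae", "t"],
    "sat": ["s", "ae", "t"],
    "egg": ["eh", "g"],
}

# Symbols counted as vowels by the boundary rule below.
VOWEL_SYMBOLS = {"ae", "ax", "eh", "iy"}


def look_up_words(text):
    """Return one symbol list per word; unknown words raise KeyError."""
    words = []
    for token in text.lower().split():
        key = token.strip(".,;:!?\"'")
        if key:
            words.append(list(PHONETIC_DICTIONARY[key]))
    return words


def apply_boundary_rules(words):
    """Modify the symbolic output where neighboring words interact."""
    adjusted = [list(word) for word in words]
    for i in range(len(adjusted) - 1):
        next_starts_with_vowel = adjusted[i + 1][0] in VOWEL_SYMBOLS
        if adjusted[i] == ["dh", "ax"] and next_starts_with_vowel:
            adjusted[i] = ["dh", "iy"]
    return adjusted


def letters_to_symbols(text):
    """Dictionary look-up followed by boundary rules, flattened to one list."""
    symbols = []
    for word in apply_boundary_rules(look_up_words(text)):
        symbols.extend(word)
    return symbols


before_consonant = letters_to_symbols("The cat sat.")
before_vowel = letters_to_symbols("the egg")
```

Because the entries are short lists of symbols, the table stays small; the context dependence noted in the text appears here as a separate pass over adjacent words rather than as extra dictionary entries.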

The other unexplored facet involves automatic generation of pitch and
duration, or intonation and stress, from the input letters. As we have said,
these quantities are closely entwined with grammatical, syntactical, and
semantic factors. Therefore one supposes that some form of linguistic
analysis must reside in the translator. Just how ambitious this need be must
be determined by experiment. Perhaps a highly schematized analysis taking
account of punctuation only would suffice for comprehension, but for
spoken English in its full glory much more is necessary. Perhaps an
in-between result could be had with the aid of a simple “score” accompanying
the input text and drawn up by a human reader.

Again,  we  see  how  easy  it  is  to  sit  in  our  armchair  and  speculate  or 
make  learned  pronouncements.  Fortunately  it  is  becoming  not  much  more 
difficult  to  experiment.  It  is  now  possible  to  do  sophisticated  experiments 
by  simulation  using  a  digital  computer.  Nothing  need  be  built;  only  written 
programs are necessary. This method permits us with relatively little
technological effort to synthesize speech incorporating various linguistic rules.
The  output  can  be  evaluated  subjectively,  permitting  a  truly  experimental 
approach.  Coupled  with  existing  theoretical  studies  of  language  structure, 
computer  methods  may  well  lead  us  to  a  viable  technology  of  language 
and  speech. 

In summarizing, let me compare the two methods discussed above. The
prerecording technique is certainly within present-day technology, though
there is some doubt as to the ultimate naturalness and comprehension
which can be obtained. Several recorded versions of most prerecordings
would probably be required to handle a literary or even journalistic text.
A crucial factor here is the vocabulary size and linguistic sophistication of
the texts to be handled. In any case memory requirements would be large;
in digital terms about 30,000 bits per second of speech (though of course
the storage would not necessarily be digital). Using a speech synthesizer,
storage requirements would be more modest, perhaps 50 bits per second.
In terms of complexity, however, we must add the logic necessary for
look-up, interpretation, and the like. Thus it is not clear which method is
preferable from the instrumentation standpoint. The synthesizer scheme
involves speculative steps, namely translation from letters to phonetic
symbols and from letters or symbols to pitch and duration. Yet one crucial
barrier, that of transducing a symbolic input to speech, has been
surmounted, and certainly we can be sure that the synthesizer method is
intrinsically capable of producing more natural and pleasing speech than
can be had from any reasonable library of prerecordings.
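The two storage figures can be put side by side with a little arithmetic. The Python sketch below uses only the rates quoted above (about 30,000 and perhaps 50 bits per second of speech); the one-hour span is an arbitrary choice made here for illustration, and the names are hypothetical. It counts stored bits only and leaves out the look-up and interpretation logic that the synthesizer scheme also requires.

```python
# Rates quoted in the text, in bits per second of speech.
PRERECORDED_BITS_PER_SECOND = 30_000
SYNTHESIZER_BITS_PER_SECOND = 50


def storage_bits(seconds_of_speech, bits_per_second):
    """Bits needed to store the given amount of speech at the given rate."""
    return seconds_of_speech * bits_per_second


ONE_HOUR_S = 60 * 60  # illustrative span, not taken from the text

prerecorded_bits = storage_bits(ONE_HOUR_S, PRERECORDED_BITS_PER_SECOND)
synthesizer_bits = storage_bits(ONE_HOUR_S, SYNTHESIZER_BITS_PER_SECOND)
ratio = PRERECORDED_BITS_PER_SECOND // SYNTHESIZER_BITS_PER_SECOND

print(prerecorded_bits)  # 108000000
print(synthesizer_bits)  # 180000
print(ratio)             # 600
```

On these figures an hour of speech costs 600 times more storage when prerecorded than when specified symbolically for a synthesizer.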

In closing, let me point out that today’s technological staples such as
the automobile and the telephone sprang not from any overwhelming
immediate need to travel or communicate but from the creation of new
technology. Needs, no matter how pressing, do not have the impact of new
technological possibilities. Artificial speech from a program-driven
synthesizer is a possibility that should not be ignored in our quest to aid
the visually handicapped.


REFERENCES 


1. Gaitenby, J. H., “Word Reading Device: Experiments on the Transposability of
Spoken Words,” J. Acous. Soc. Amer., Vol. 33 (1961), p. 1664.

2. Kelly, J. L. and L. J. Gerstman, “An Artificial Talker Derived from a Phonetic
Input,” J. Acous. Soc. Amer., Vol. 33 (1961), p. 835.

3. Peterson, G. E., W. S-Y. Wang, and E. Sivertsen, “Segmentation Techniques in
Speech Synthesis,” J. Acous. Soc. Amer., Vol. 30 (1958), pp. 739-742.

4. Stowe, A. N., and D. B. Hampton, “Speech Synthesis with Prerecorded Syllables
and Words,” J. Acous. Soc. Amer., Vol. 33 (1961), pp. 810-811.


SUMMARY  AND  COMMENT 


OLIVER  G.  SELFRIDGE 

Lincoln  Laboratory,  Massachusetts  Institute  of  Technology,  Lexington, 
Massachusetts 


I  would  like  to  make  some  general  remarks  relevant  to  the  kinds  of  questions 
that  have  been  raised  in  this  Section. 

First of all, it is very clear that from almost every point of view
reading by a sighted person is an extraordinarily different process from any
of the “reading techniques” for the blind. Any of the reading techniques
proposed, asserted, or even hoped for in the wildest dreams of some
participants will remain quite different processes, not just in speed, but in
all kinds of applications. To some extent, this difference seems to be
unnecessary. I hope, furthermore, that it will be unnecessary to continue
this difference.

Recently  there  has  been  some  discussion  in  the  public  press  of  very 
high  reading  speeds  on  the  part  of  sighted  people.  I  would  think  that  it  is 
almost  certain  that  large  claims  of  speeds  on  the  order  of  thousands  of 
words  per  minute  are  poppycock;  nevertheless  it  is  clear  that  many  sighted 
readers  can  in  fact  read  text  at  a  rate  of  about  a  thousand  words  a  minute, 
reading  it  well,  and  reading  it  with  comprehension.  The  “comprehension” 
involved  is  not  that  usually  considered,  in  which  one  reads  a  piece  of  text 
and  then  repeats  it.  It  is  semantic  comprehension.  Thus  sighted  people 
quite  commonly  read  novels  very  quickly  and  can  repeat  the  story  but  not 
the  words.  But  none  of  us  reads  poetry  at  a  thousand  words  a  minute — I 
hope!  In  the  kind  of  goals  that  most  people  have  in  mind  in  discussing 
reading  machines  for  the  blind,  they  think  what  they  want  is  a  reading 
machine. 

If  I  were  blind,  however,  that  wouldn’t  be  what  I  would  want,  because 
I  do  so  many  different  things  when  I  read.  When  I  read  poetry  it  is  just 
not  the  same  process  as  when  I  read  a  novel;  I  don’t  get  the  same  things 
out  of  them.  In  one,  I  am  conscious  of  the  words,  the  shape  of  words,  and 
the  spelling;  in  the  other  I  couldn’t  care  less.  If  I  am  reading  a  textbook, 
it  is  for  a  sense  which  is  a  much  harder  kind  of  sense  than  in  a  novel, 
especially if the textbook is in a subject that I don’t know very much about.
Here I may not be familiar with the words used; if I am reading a medical
textbook  I  may  have  to  stop  and  look  up  a  new  word,  or  I  may  have  to 
reconstruct  what  the  meaning  is.  If  I  am  reading  a  subject  which  I  feel  I 
know  something  about,  however,  say  reading  a  paper  for  review,  I  can 
skim  over  large  paragraphs,  getting  the  feeling,  “Well,  this  is  all  right;  I’ve 
sort  of  seen  this  before.” 

These  are  all  abilities  that,  with  the  goals  mentioned  in  these  papers, 
will  just  not  be  available  to  the  blind  reader. 

Here,  it  seems  to  me,  we  ought  to  raise  the  question  of  social  value. 
Not  very  many  years  ago  a  blind  person,  with  his  incredible  human  and 
intellectual  value  to  society,  was  in  effect  completely  discarded.  We  are 
beyond  that  stage  now — but  not  by  much.  We  should  not  so  easily  accept 
that  set  of  values  implicit  in  our  present  state  of  the  art  in  technology;  when 
we  do  we  are  denying  the  usefulness  to  many  valuable  people  of  the  kinds  of 
abilities  which  we  as  sighted  readers  take  for  granted.  This  is  in  fact  a 
taxation,  a  kind  of  invisible  taxation,  which  we  need  not  and  should  not 
accept.  When  I  see  the  kinds  of  things  that  blind  people  do  in  their  daily 
lives,  and  how  they  handle  the  ordinary  tasks  that  I  handle  by  reading,  I 
feel  very  strongly  that  any  sighted  person  would  be  appalled  at  the  load  he 
would  have  to  carry  to  replace  his  ability  to  read.  Just  because  of  this,  it 
would  seem  to  me  about  time  we  began  to  put  forth  a  fair-sized  effort  in 
this  area.  I  would  suggest  that  all  the  admirable  efforts  we  have  read  in  the 
preceding  papers  represent  a  national  effort  which  in  fact  is  far  below 
threshold. 

All  the  techniques  mentioned  in  the  papers  have  an  upper  limit  of 
reading  speed  of  about  100  words  per  minute.  This  order  might  be  50 
percent  higher,  although  those  who  usually  speak  of  its  being  50  percent 
higher  tend  to  “fudge”  a  little.  Now,  there  is  nothing  magical  in  reading 
speed  per  se,  but  it  is  surely  an  important  symptom.  A  speed  of  100  words  per 
minute  is  an  order  of  magnitude  below  what  the  sighted  reader  of  these 
pages  can  attain.  An  order  of  magnitude  is  a  very  large  difference.  In 
electrical  engineering  an  order  of  magnitude  means  another  stage  of  gain. 
In  practice  an  order  of  magnitude  is  much  more  than  that:  a  mere  3  db 
difference  in  pay,  for  example,  makes  a  lot  of  difference,  and  that  is  only 
a  third  of  one  order  of  magnitude! 
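The decibel remarks can be checked with a short calculation. The sketch below assumes the power-ratio convention, in which a ratio r corresponds to 10 log10(r) decibels, so that one order of magnitude is 10 db; that convention is an inference from the "third of one order of magnitude" remark rather than something stated outright, and the function names are hypothetical.

```python
import math


def decibels(ratio):
    """Express a ratio in decibels (power-ratio convention, assumed)."""
    return 10.0 * math.log10(ratio)


def ratio_from_db(db):
    """Inverse of decibels(): the ratio corresponding to a decibel figure."""
    return 10.0 ** (db / 10.0)


# A sighted reader at about 1,000 words per minute against 100 words per minute.
reading_gap_db = decibels(1000 / 100)

# The ratio hidden in "a mere 3 db difference in pay".
pay_ratio = ratio_from_db(3.0)

# Fraction of one order of magnitude (10 db) that 3 db represents.
fraction_of_order = 3.0 / decibels(10.0)
```

Under this convention the reading-speed gap is 10 db, a 3 db difference is very nearly a doubling, and 3 db is three tenths of an order of magnitude, which is the "third" referred to above.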

Then there is the matter of flexibility in use. The sighted reader can
scan to look for something he is interested in. He can, so to speak, insert
a “filter” between himself and what he reads. Some people (though not
you nor I, gentle reader) have been known to look for their own names in
references; it is possible in fact to go through a written paper looking at a
page every four or five seconds for one’s name, and it will stand out. This
is reading which goes much faster than a thousand words per minute, and
yet it is fairly reliable. This kind of technique is absolutely inaccessible
with any of the sequential techniques of presentation we have had
presented here. Another factor: probably one of the most useful abilities we
have among sighted readers is the ability of the eye to jump, to glance and
take in a certain set of words and then to glance somewhere else. This is
not sequential in operation. I am not a psychologist, but it strikes me that
the match between the printed page and the eye depends very much on the
ability of the eye to jump over small or large distances. Somehow this kind
of ability ought to be provided through the other senses. Not being a
psychologist, I have no idea of how to go about this; it is a querulous
complaint of mine.

We have also said that the blind have to read. I grant this, although as
I have said I think that reading is several different things, depending on one’s
purpose. But it is still not clear to me what the purpose of the access is at
all. I probably spend half of my own time reading for information, and the
other half of my time for entertainment. Surely these are very different
kinds of purposes, and they will probably require different means of
technical expression.

Now  let  me  comment  a  little  on  the  papers  as  I  heard  them. 

The  first  four  papers  in  this  Section  all  referred  to  braille  in  one  way 
or  another;  the  rest  of  the  papers  referred  to  contact  with  the  human  user 
by  way  of  speech.  Dr.  Mann  said  that  he  was  concerned  with  “.  .  .  the 
problem  of  getting  more  braille  to  the  blind.”  He  said  this,  of  course,  in 
the context of questioning the very beginnings of braille, of the real
purpose of braille. As I understand it, braille is now about 100 years old; its
habits  and  its  compromises  have  been  set  up  and  established  without — 
necessarily  without — any  real  understanding  of  what  is  technologically 
possible,  and  I  think  probably  without  any  real  understanding  of  what  is 
humanly  possible. 

In answer to one question, Dr. Yngve said, “. . . you would like to
compromise certain of the rules of braille so as to make it possible to
mechanize some of the translation difficulties that you would otherwise
encounter,” and this statement initiated a general discussion. In that
discussion it was stated that the rules of braille were made for man and,
rather than remaking them for the machine, we ought to prefer the man. Yet
the rules of braille are not in themselves anything except insofar as they
are useful to the men who use them. The “rules of braille,” as they are set
up, are very unfortunate in this context precisely because they quantize
what is essentially a very continuous thing for the rest of us. There are not
three levels of reading for people who read books; there is a very smooth
continuum in which the child learns to read books by going little by little,
the tasks getting tougher and tougher. No five-year-old can read medical
texts. Yet he can go from elementary reading to advanced reading because
he can go smoothly and continuously, enlarging his vocabulary bit by bit.
We avoid Reading 1, Reading 1½, and Reading 2. It is a shame this is not
possible in braille, since it means that someone is stuck unless he takes a
course to learn to read Grade 2 braille.

There is nothing inherently wrong with contractions, of course; a
secretary uses them all the time. If she takes shorthand or stenotyping, she
may increase or decrease the amount of contraction in English, depending on
her skill, her experience, the context, her particular job, and so on. Thus
by establishing a set of more or less inviolable rules by which braille has
to be handled, we are rather straitjacketing ourselves when we don’t
even have to, at this stage.

I’d  like  to  raise  the  general  point  that  there  is  a  place  here  for  looking 
at braille as it matches the real abilities of people. I don’t mean by this
braille  as  a  tactile  exercise  in  this  case;  I  mean  braille  as  a  communicative 
encoding  medium. 

Some  other  papers  were  concerned  with  the  management  of  texts  and 
were  excellently  presented.  Mr.  Zickel  has  had  to  be  concerned  in  his 
business with points of trivia, actually, that we ought not to have to be
concerned with. There ought to be enough money so that the exact cost per
page,  and  the  expenses  of  the  metal  for  embossing,  weren’t  really  the 
effective  decisions  we  have  to  make.  Even  in  Dr.  Mann’s  paper,  I  felt  that 
the administrative difficulties in handling punched paper tape really ought
not  to  be  the  deciding  difficulties  in  getting  braille  books.  If  braille  books 
are  the  best  way  to  communicate — and  I  might  say  that  I  am  not  sure  they 
are — it  ought  not  to  be  a  question  of  how  to  handle  punched  tapes  that 
decides  the  issues.  It  is  now,  and  it  strikes  me  that  this  is  a  great  shame. 

Then Mr. Stevenson spoke about Ste-Re, which is at least concerned
with writing. Being a simple-minded mathematician about this, I will guess
that perhaps multidimensional signals will be capable of somewhat higher
speeds than we spoke of before. It turns out here that the number of
distinct recognitions per second (a rather involved issue, to be sure) is on
the order of ten per second, apparently. This means about 70 words per
minute. I have seen a person who reads at this speed with Morse code;
the same sort of thing is what he anticipates for Ste-Re. I must say that I
admire what strikes me as his incredible assurance when claiming one can
reach a speed five times higher than the maximum he shows in his Figures.
I wish him all luck.

Mr. Dersch has shown us an interesting machine which is probably
very useful, but I think he is not really being entirely fair to the rest of the
speech recognition business in calling it a speech recognition machine. A
vocabulary of 20 words is not necessarily easily extended to a vocabulary
of several hundred words. One well-known speech recognition machine has
a vocabulary of 200 words, and I myself would not be satisfied with that as
a means of communication. A vocabulary of only 20 words is rather like
another set of pushbuttons, as someone in the audience remarked. Some
time ago, in a study we were doing for the Signal Corps at Cape Cod during
the summer, I jocularly proposed that communications between tanks
would become very easy if a tank driver could learn to run a teletype with
his toes. (The idea was that his toes would move within his large boot; the
teletype keys were underneath the toes, which he could learn to move one
at a time; and the other foot could be used for receiving messages back.)
In hearing about the Ste-Re I immediately thought about the toes; actually,
however, the communication rates with the Stevenson device will be rather
higher than with the Dersch device. This is not speech recognition because
the vocabulary is really very limited (it is an accomplishment that it is
limited, but it is still limited for this purpose). In addition, the segmentation
problem, which is a very large part of the total problems of speech
recognition, is avoided by insisting on artificial segmentation. Dersch insists
that people end the word “eight” with a “t,” whereas I don’t see why he
didn’t merely deny the “t,” because experimentally most people don’t add
the “t” at the end of the word “eight.” The information rate with this
system is rather lower than that apparently available with the other systems.

The real problems of speech recognition were presented well by Dr.
David, but the outlook for their solution does not strike me as enormously
optimistic for the short run. In the long run (by which I mean five years
or more) it is optimistic. At this point, it seems to be about time to
consider how much of the problem of reading can be solved by a machine which
reads aloud to the user. I found myself puzzled by the long general
discussion of intonation, because when I read a book I do not see intonation
in the printed page, nor do I find myself hindered by not seeing it there.
One of my querulous complaints is that the spectrum (not a single set) of
the problems of reading, of the requirements for reading, has not been
delineated. These requirements are not available so that people can work
on them; they must be re-interpreted by each person working in the field.
This is a real drawback because it means that every time one gives a talk
one has to specify what one’s requirements are; one cannot take them for
granted. I think it would be well to look at the actual habits of reading,
as have some of the papers in this Panel, to find out what the real
requirements are. This kind of study is often called “operations research,”
and although I hate to use the term, I think this is the kind of study needed
before we are to get above threshold in this area. At today’s stage of
technology, I think we really need to interpret our social values (and what
they should be) to get above threshold.

Dr.  David  paid  homage  to  Paget;  I  would  like  to  pay  homage  to  a 
machine which used no transistors at all. It was called a “butterfat
recognizer,” and it was built in J. C. R. Licklider’s group some years ago, in
the  late  ’forties.  This  device  didn’t  have  to  bother  with  segmentation;  any 
time  you  used  the  word  “butterfat”  in  conversation — although  it  doesn’t 
arise  terribly  often,  I  know — the  thing  would  light  up,  and  do  other  things, 
too,  if  you  wanted  it  to.  The  trouble  with  this  machine — and  this  illustrates 
some  of  the  difficulties  Dr.  David  brought  up — is  that  it  doesn’t  make  an 
adequate  kind  of  distinction  for  useful  voice  recognition,  which  is  in  fact 
a  very  hard  problem.  As  one  ordinarily  talks  about  speech  recognition, 
one  is  concerned  ultimately  with  conversational  speech,  or  with  recording 
spoken  speech  as  it  is  ordinarily  spoken  without  training.  I  would  like  to  see 
what  kind  of  semantic  rates  can  be  achieved  with  sound,  using  the  kind 
of multidimensional presentations that have not been used so far.
According to the sonar psychologists there seem to be different kinds of
dimensions of distinguishability that we know exist in sounds. I would like to
see  what  kinds  of  rates  of  appreciation  could  be  achieved  with  training.  I  am 
concerned  here  with  fast  reading,  primarily.  This  is  not  the  same  thing  as 
looking  up  telephone  numbers  in  a  book,  in  which  case  we  don’t  need  such 
fast  rates  of  reading. 

The  problem  of  reading,  as  a  whole,  has  not  been  specified,  nor  even 
looked  at,  adequately  enough  for  us,  as  scientists,  to  be  very  proud  of  where 
we  stand  in  this  field  today. 


Panel  V— Plenary  Session 


INTRODUCTION 


CHARLES  HEDKVIST 

De  Blindas  Forening,  Stockholm,  Sweden 


It  has  been  said  that  blindness  is,  in  one  view,  a  technical  handicap.  The 
day  we  find  a  method  or  an  instrument  with  the  help  of  which  the  blind  can 
get  as  clear  an  idea  of  the  world  around  them  as  the  sighted,  blindness  will 
no  longer  constitute  a  problem.  Psychological  and  other  effects  of  blindness 
consequently  should  be  connected  with  our  inability  to  solve  all  technical 
problems  involved  in  the  handicap. 

Even  if  this  statement  of  the  problem  is  somewhat  simplified,  I  believe 
that  it  is  right  in  principle;  if  so,  it  is  easy  to  believe  that  this  Congress  may 
be  the  beginning  of  a  new  epoch  in  the  history  of  the  blind,  an  epoch  of 
systematic  and  rational  research  on  one  of  the  most  important  fields  in  the 
effort  to  overcome  blindness  as  a  handicap.  Therefore,  we  also  have  every 
reason  to  express  sincere  thanks  to  the  American  Foundation  for  the  Blind 
and  other  institutions  in  the  U.S.A.  which  made  this  Congress  possible. 

I also would like to compliment the American Foundation for the
Blind for having realized that this area demands full international
cooperation and that, as a logical consequence of this, it has asked for close
collaboration with the World Council for the Welfare of the Blind.

This Congress had, as I understood it, two purposes. The first four
days of the Conference focused attention on the first purpose: an attempt
to establish where we stand today as far as technology and blindness are
concerned. It is an impressive statement. One also has a strong feeling
of unsatisfactory or infrequent contact among scientists in different
countries which leads to waste of both economic and personnel resources.

The  last  day  of  the  Congress  gave  attention  to  the  second  purpose;  to 
take  a  position  on  the  form  of  continued  and  effective  international  co¬ 
operation.  We  did  not  require  that  the  Congress  come  to  a  final  decision 
as  far  as  the  organization  of  such  cooperation  is  concerned;  but  immediately 
after  this  Panel  closed,  the  Technical  Subcommittee  of  the  World  Council 
for  the  Welfare  of  the  Blind  met  in  order  to  decide  on  preliminary  plans 
for continued international cooperation in consultation with various
interested organizations and institutions. The discussion during Panel V of
this  Congress  was  naturally  of  the  greatest  value. 

In  what  follows,  we  include  some  general  papers  relating  to  technologi¬ 
cal  research.  The  summary  statements  prepared  originally  for,  and  de¬ 
livered  in,  Panel  V  have  been  incorporated  into  the  Introductions  to  Panels 
I  through  IV  in  this  published  version  of  the  Proceedings.  Dr.  Josephson’s 
paper  was  presented  originally  at  a  special  joint  session  of  Panels  I  and  II 
on  the  first  day  of  the  Congress. 


SUMMARY  OF  PROCEEDINGS  OF  THE 
INTERNATIONAL  CONGRESS  ON 
TECHNOLOGY  AND  BLINDNESS 

JOHN  K.  DUPRESS 

American Foundation for the Blind, New York, New York


The  various  Panels  and  Sections  of  this  Congress  reflect  the  orientation  of 
the  field  to  problems  which  arise  in  the  performance  of  simple  and  complex 
tasks  with  impaired  vision  or  total  blindness.  The  papers  which  were  pre¬ 
sented  on  instrumentation  research  indicate  attempts  at  specific  solutions. 
It  is  natural  that  this  problem-and-solution  orientation  should  pervade  our 
thinking,  although  it  has  both  strong  advantages  and  disadvantages.  Specific 
proposals  have  been  funded  and  design  criteria  selected  which  promise  at 
least  a  partial  solution  for  each  major  problem.  On  the  other  hand,  the 
premise  that  instrumentation  could  substitute  for  sensory  losses  has  caused 
general  neglect  of  human  engineering  or  the  man-machine  interface;  more 
importantly,  it  has  de-emphasized  the  importance  of  the  human  and  his 
remaining  capabilities.  That  this  quite  restricted  point  of  view  is  now  being 
broadened is reflected, for example, in the Panel on Living Systems.* In the
Living  Systems  Panel  we  have  considered  many  aspects  of  man’s  remaining 
sensory  capabilities  and  also  his  unique  mental  processes. 

Let  us  review  briefly  the  attempts  which  have  been  made  to  solve  the 
three  major  problems  in  this  field.  These  problems  are  direct  and  indirect 
access  to  the  printed  word  and  graphic  forms;  mobility;  and  the  maximum 
utilization  of  remaining  sensory  channels.  A  fourth  area,  which  was  omitted 
from  the  Congress  because  there  is  no  current  research,  is  direct  access  to 
the  spoken  word  for  the  deaf-blind. 

The  choice  of  the  terms  “direct  access”  and  “indirect  access”  is  espe¬ 
cially  important  in  highlighting  the  amount  of  sighted  human  intervention 
involved  and  the  requirements  for  the  production  of  special  materials.  The 
degree to which a blind person is handicapped is proportional to his
dependence on others for total interaction with his environment. A reading
machine or a typesetting tape code converter may open up the whole world
of stored human intelligence for the blind individual. Until such devices
are available to blind persons, however, they must depend upon the indirect
routes of sound recording media and braille.

* Volume II of these Proceedings.

Although  the  first  optical  image  to  sound  conversion  device  was  built 
at  Cambridge  University  prior  to  World  War  I — and  there  has  been  a  con¬ 
siderable  history  of  research  in  America  starting  with  World  War  II  and 
continuing  under  the  substantial  Veterans  Administration  program — we 
are  still  some  distance  from  a  practical  reading  machine  which  any  blind 
person  may  have  in  his  home.  The  present  state  of  the  art  limits  us  to 
personal reading machines which at present permit only much slower
comprehension rates than are possible with a sighted transcriber generating sound
recording  media  or  Grade  2  braille.  The  alternatives  to  the  Battelle  Reader 
and  Optophone  are  the  Letter  Recognition  Machine  and  the  Type-Setting 
Tape  to  Spoken  Word  Machine.*  Both  types  will  require  several  additional 
years  of  research  to  perfect.  Although  the  future  does  not  seem  highly 
promising  in  the  area  of  reading  machines,  the  research  must  continue; 
even  so,  we  can  expect  only  a  very  small  number  of  blind  persons  to  derive 
benefit  from  direct  access  reading  machines  for  specialized  needs.  The 
highly  varied  needs  of  blind  people  in  their  educational,  vocational,  and 
cultural  pursuits  can  alone  inspire  and  justify  continued  endeavor. 

A  number  of  mobility  devices  which  detect  objects  by  electromagnetic 
waves  or  sound  emission  have  been  constructed.**  The  most  sophisticated 
version  detects  more  than  95  percent  of  objects  most  commonly  encoun¬ 
tered  in  travel.  Terrain  changes  in  the  form  of  curbs  and  step-downs  have 
proven  to  be  more  difficult  to  detect.  Although  two  laboratory  prototypes 
have  been  designed  which  warn  the  user  of  terrain  changes  as  small  as  two 
inches  under  nearly  all  operating  conditions,  much  work  needs  to  be 
done  before  they  become  compact,  lightweight,  and  reliable.  Furthermore, 
the  precise  information  provided  by  the  cane  is  indispensable  for  accurate 
vertical  movement  up  or  down  stairs.  Thus  far,  the  important  areas  of 
orientation  and  navigation  remain  unaided  by  substantial  innovations, 
and are even without prototype development. It can be said that all the
devices  currently  under  development  will  be  useful  only  to  the  individual 
who  has  highly  trained  mobility  capability  and  experience.  In  addition, 
he must utilize to the utmost his remaining senses and mental processes.
As of the present time, the cane, the dog guide, and the human guide are
unchallenged as fail-safe aids for complex and dangerous situations.

* See the papers in Panel I, Section 2 (this volume).

** See Panel I, Section 1 (this volume).

In  spite  of  the  fact  that  braille  has  been  a  useful  indirect  access  route 
for  more  than  one  hundred  years,  only  about  3  percent  of  the  blind  popu¬ 
lation  read  one  or  more  braille  books  per  year;  less  than  25  percent  of 
all  blind  people  learn  braille.  For  those  who  must  secure  an  education  as 
blind  children,  or  whose  employment  requires  extensive  notes,  braille  is 
an  indispensable  tool.  Recent  research  and  development  has  made  it 
feasible  for  the  computer  and  automation  to  facilitate  the  production  of 
braille  material.*  We  are  also  in  the  era  of  teaching  machines  which  may 
soon  contribute  to  braille  learning.** 

A  second  indirect  access  route  is  sound  recording  media.  Commercial 
development  has  facilitated  the  flow  of  information  to  the  blind  through 
long  playing  Talking  Books  and  more  recently  the  tape  recorder.  The 
significant  cost  factor  cannot  be  overlooked  in  providing  special  material 
for  handicapped  individuals.  Technology  and  industry  will  continue  to 
reduce  the  costs  of  individual  books  converted  to  braille,  tapes,  and  discs. 
A substantial body of volunteers is available when computers are not
economically feasible.***

In  the  panel  on  Living  Systems,  a  number  of  papers  were  presented 
which explore the performance of tasks by living beings other than man.†
The  researchers  in  this  panel  also  discussed  the  organization  and  structure 
of  the  human  sensory  system  and  certain  aspects  of  human  data  processing. 
The  beginning  of  research  on  sensory  testing  and  training  for  the  blind 
has  also  been  explored.  It  cannot  be  emphasized  too  strongly  that  until 
such  time  as  we  have  substitutes  for  vision,  or  until  vision  is  restored  by 
organ  transplants,  or  until  data  is  introduced  directly  into  the  central 
nervous  system,  we  must  depend  upon  the  blind  or  deaf-blind  person  to 
perform  most  or  all  of  the  tasks.  It  is  unrealistic  to  assume  that  each 
blind  person  will  have  innumerable  devices,  each  designed  to  be  useful  in 
the  performance  of  one  task  only.  Where  possible,  the  blind  person  should 
be  able  to  use  the  wide  variety  of  devices  already  in  existence  for  his  sighted 
fellowmen.  Too  many  special  devices  for  the  blind  enhance  the  attitudes 
of difference and dependence. We must emphasize instead the training of
the individual to maximize his capability while we supplement this
capability in general rather than in limited ways. Adapted and special purpose
devices will continue to be required, but they are secondary to techniques,
and are interim solutions until we can provide more general ones.

* See Panel I, Section 3 (this volume).

** See Panel IV, Section 4 (Volume III of these Proceedings).

*** See the papers of Panel III (Volume III of these Proceedings).

† See the papers in Panel I, Section 1 (this volume), and Volume II of these
Proceedings.

In  evaluating  progress  to  date,  let  us  consider  the  resources  available 
to  the  solution  of  the  problems.  As  of  the  fiscal  year  1961-1962,  more 
than  97  percent  of  all  funds  allotted  for  sensory  aids  research  or  sensory 
training and testing research for the blind in this country were provided
by  government  agencies.  All  of  the  research  is  being  done  at  universities 
and  not-for-profit  laboratories.  Of  all  the  organizations  for  the  blind  in 
the United States, only one has any full-time staff in the area of
technological research: the American Foundation for the Blind,
with only one staff member and an assistant. Most of the sensory aids
researchers  are  part-time  and  have  academic  and  other  research  commit¬ 
ments.  There  are  14  projects  currently  active.  Five  are  in  the  reading 
machine  area;  three  deal  with  braille  computer  transcription  and  general 
braille research; two are in mobility devices; three are in sound recording media;
and one is in general basic sensory aids research for the blind.

Although  the  United  States  has  the  largest  program,  researchers  in 
other  countries  have  demonstrated  that  they  are  engaged  in  efforts  of 
equal  excellence.  Sweden  has  an  ambient  mobility  device  and  a  naviga¬ 
tional  aid;  England  is  field  testing  an  ultrasonic  mobility  device  and  is 
doing  research  in  the  psychophysical  reading  machine  area;  Russia  has 
reported  on  one  reading  machine;  and  Poland  on  a  mobility  device.  In 
Canada  researchers  are  working  in  their  spare  time  on  a  navigational  aid, 
and  are  engaged  in  some  psychophysical  studies.  At  least  a  half-dozen 
countries,  including  the  United  States,  have  programs  to  adapt  and  design 
special  purpose  devices. 

One  of  the  main  factors  which  has  limited  progress  in  this  field  is 
component  and  systems  development.  With  limited  staff  and  funds  progress 
has  been  slow,  and  improvements  frequently  wait  upon  military  and  in¬ 
dustrial  developments.  Assistance  can  be  provided  outside  the  area  of 
research  for  the  blind  because  we  have  problems  in  common.  For  example, 
the  military  is  extremely  interested  in  nonvisual  object  detection;  industry 
and  government  are  spending  large  sums  of  money  on  character  recogni¬ 
tion  machines;  numerous  organizations  are  trying  to  measure  and  train 
humans  for  educational  and  vocational  pursuits.  In  spite  of  this  common 
ground,  however,  there  are  important  differences.  The  data  processing 
industry  demands  extreme  accuracy  in  character  recognition  at  high 
speeds. Unlike the blind population, this industry can use special type
fonts and can pay large sums of money for central facilities. The blind
person  can  provide  most  of  the  intelligence  needed,  and  he  can  tolerate 
the  machine  working  at  a  slow  rate  of  speed  and  with  a  substantial  num¬ 
ber  of  errors.  The  data  processing  industry  and  the  blind  population, 
therefore,  will  want  quite  different  machines.  Components  and  systems 
can  be  borrowed  nevertheless  from  industry. 

When  the  nation  depends  on  radar,  sonar,  and  infrared  research  for 
its  survival,  billions  of  dollars  are  spent.  In  most  cases  the  search  area 
of  the  object  detectors  is  relatively  uncluttered  and  only  one  or  at  most  a 
few  targets  must  be  located.  The  equipment  can  be  large  and  expensive. 
Tremendous  amounts  of  energy  expenditure  and  low  efficiency  can  be 
tolerated.  In  contrast,  the  blind  person  walking  about  an  environment 
cluttered  with  objects  and  with  terrain  changes  at  close  range  has  a  much 
more  difficult  target  discrimination  problem.  Moreover,  his  equipment 
must  be  compact,  light,  fail-safe,  and  relatively  inexpensive  to  purchase 
and  maintain.  Fortunately,  some  of  the  principles  of  active  or  passive 
energy  radiating  devices  for  the  military  may  be  applied  to  research  for 
the  blind.  In  general  we  can  benefit  from  military  research  only  in  cer¬ 
tain  components  and  systems.  There  remains  a  considerable  amount  of 
special  development  needed  for  sensory  aids. 

Psychological,  intelligence,  and  sensory  testing  is  another  rapidly  ex¬ 
panding  field  in  universities  and  in  industry.  As  these  tests  are  used  more 
widely  and  are  validated  it  remains  only  for  our  field  to  borrow  many  of 
these  tests,  modify  them  for  presentation  through  other  than  the  visual 
channel,  and  determine  if  they  are  valid  for  the  blind  and  deaf-blind.  There 
still  remain,  however,  some  special  sensory  tests  not  considered  necessary 
for  use  with  the  normal  individual.  One  example  is  a  thorough  analysis  of 
cutaneous  sensitivity  thresholds,  which  are  very  important  to  the  deaf- 
blind  because  this  is  their  only  remaining  sensory  channel  of  any  potential 
real  usefulness. 

It  might  be  well  to  enumerate  for  the  record  what  I  see  as  the  major 
ingredients  which  are  essential  in  the  solution  of  any  of  the  problems 
affecting  the  blind  and  deaf-blind.  They  are  the  following. 

1) Researchers in a number of disciplines, in some cases with
interdisciplinary training, should be challenged to enter the field of sensory
deprivation  with  commitments  beyond  the  one-  to  three-year  project. 

2)  There  should  be  concurrent  long  range  planning  which  includes 
a  broad  program  or  research  center  approach  similar  to  that  of  the 
Committee on Sensory Devices, but expanded to include human
engineering and the behavioral sciences. Although it can be truthfully said
that one of these centers exists at MIT, it is our hope that several more
can  be  established  in  other  parts  of  this  country — and  possibly  abroad. 
As long as research centers are university based and there is close
cooperation with service agencies for the blind, we can train scientists in
sensory deprivation as we train them for their primary skills in physical
and  behavioral  sciences. 

3)  In  order  that  any  long  range  plan  may  succeed  there  must  be 
coordination  and  consultation  among  government  agencies  (which  pro¬ 
vide  the  funds),  universities  and  private  laboratories  (which  actually 
do  the  research),  and  service  agencies  for  the  blind  (which  can  bring 
the  results  of  research  into  the  lives  of  blind  people).  In  addition  there 
must  be  one  or  more  central  data  collection,  processing,  and  dissemina¬ 
tion  groups  which  specialize  in  the  area  of  the  blind  and  deaf-blind. 

4) Although nearly all of the financial support for research in this
field  must  continue  to  come  from  government  agencies,  it  is  the  obliga¬ 
tion  of  research  laboratories  and  service  agencies  to  improve  the  quality 
of  research  and  to  emphasize  general  rather  than  specific  solutions  which 
may  also  be  of  value  to  persons  other  than  the  blind  and  deaf-blind. 
As  research  costs  increase,  and  we  become  better  aware  of  the  kinds 
of research which need to be done, the support level must rise rapidly.
In the technological area the present funding in the United States
of about $500,000 per year must multiply fourfold within three to five
years  if  we  expect  to  achieve  terminal  results.  There  should  be  a  similar 
increase  abroad. 

5)  In  the  final  analysis  our  single  objective  is  to  improve  the  lives 
of  the  blind  and  deaf-blind  people.  The  major  organizations  for  the 
blind therefore must work more closely with researchers and
government agencies so that the products of the laboratory may be placed in
the  hands  of  the  individual  who  needs  them.  This  last  and  most  im¬ 
portant  step — in  the  continuum  from  ideas,  theories,  and  problems  to 
devices,  tests,  training  materials,  and  techniques — must  be  a  closed 
loop  circuit. 
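
The funding target in item 4 can also be read as a yearly growth rate. What follows is a minimal sketch, assuming steady compound growth from the stated base of about $500,000; the paper itself gives only the fourfold goal and the span of three to five years, and the function name is illustrative.

```python
def required_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth rate that reaches `multiple` after `years` years."""
    return multiple ** (1.0 / years) - 1.0


BASE_FUNDING = 500_000   # dollars per year, as stated in item 4
TARGET_MULTIPLE = 4      # "multiply fourfold"

target_funding = BASE_FUNDING * TARGET_MULTIPLE
growth_3yr = required_annual_growth(TARGET_MULTIPLE, 3)  # fastest schedule
growth_5yr = required_annual_growth(TARGET_MULTIPLE, 5)  # slowest schedule

print(f"Target funding: ${target_funding:,} per year")
print(f"Growth needed over 3 years: {growth_3yr:.1%} per year")
print(f"Growth needed over 5 years: {growth_5yr:.1%} per year")
```

Under these assumptions the support level would have to rise by roughly one-third to more than one-half in each year of the period.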

In  the  light  of  these  essential  requirements,  let  us  raise  the  question, 
“What is the role of this Congress and other like conferences?” Meetings
of this kind bring together individual researchers and specialists in work
for  the  blind  and  deaf-blind  when  they  would  probably  not  otherwise  con¬ 
vene.  The  result  is  an  exchange  of  theories  as  well  as  statements  about 
what needs to be done, what seems to work, what fails, and an over-all
conception of the problems each faces. Since agencies for the blind have
a  single  long  range  commitment  to  assist  blind  and  deaf-blind  persons  to 
overcome  their  handicaps,  it  is  logical  that  they  assume  some  responsibility 
for  research  planning,  for  the  stimulation  of  research,  for  the  exchange 
of  information,  and  for  the  implementation  of  research  through  develop¬ 
ment.  This  is  possible  only  with  the  active  cooperation  of  scientists  in 
advisory  committees,  and  the  close  cooperation  of  government  agencies 
which  act  through  the  people  to  facilitate  research.  As  the  number  of  blind 
people  grows  at  a  faster  rate  than  the  population  as  a  whole,  and  blind 
persons  enter  an  ever-increasing  variety  of  vocations,  it  is  essential  that 
agencies  for  the  blind  benefit  from  research.  There  is  growing  competition 
with  other  handicapped  groups  to  raise  funds  from  the  same  sources.  The 
very  same  appeals  for  funds  tend  to  diminish  the  positive  image  of  the 
blind  person  as  independent  and  competent.  Agencies  for  the  blind  cannot 
continue  to  expect  a  sufficiently  rapid  increase  in  funds  or  volunteer  as¬ 
sistance.  Research  must  succeed  in  work  for  the  blind  if  we  are  to  keep 
up  with  the  increased  demands  on  us. 

We  have  mentioned  that  scientists  can  perform  a  valuable  function  in 
our  field  in  an  advisory  capacity  to  the  private  agencies  for  the  blind  and 
to  governmental  agencies  which  provide  funds  and  certain  services.  With 
scientists  playing  an  ever-increasing  role  in  policy  making  in  our  federal 
government  and  abroad,  we  can  also  look  to  them  for  leadership  in  re¬ 
search  planning  and  funding  for  the  blind  and  deaf-blind.  These  roles  can 
be  best  achieved  by  individual  scientists  as  members  of  ad  hoc  committees 
which  deal  with  specific  problems  in  the  area  of  blindness  and  deaf¬ 
blindness.  Such  ad  hoc  committees  need  not  be  affiliated  with  any  organi¬ 
zation  for  the  blind.  Long  range  planning  can  be  best  secured  by  special 
conferences  which  are  carefully  limited  to  working  groups  of  the  proper 
size.  Any  organization  for  the  blind,  including  the  American  Foundation 
for  the  Blind,  which  engages  in  any  research  activity  can  benefit  from 
advisory  committees  of  scientists  whose  subjective  analysis  is  available  to 
management  and  staff. 

With  more  than  6000  deaf-blind  persons,  nearly  400,000  blind  persons, 
and  perhaps  1,000,000  functionally  blind  individuals  in  the  United  States 
alone,  it  is  unthinkable  to  waste  our  human  resources.  It  is  a  challenge 
to  all  of  us  to  contribute  significantly  to  the  worth  and  dignity  of  the 
individual.  Whether  the  motivation  is  based  upon  compassion  or  a  keen 
interest  in  the  problems  involved  is  irrelevant  as  long  as  we  continue  to 
attract  creative  minds. 


DEMOGRAPHIC AND
SOCIAL ASPECTS OF BLINDNESS

ERIC  JOSEPHSON 

American  Foundation  for  the  Blind,  New  York,  New  York 


Who  are  the  blind  and  what  are  their  technological  needs?  Answers  to 
such  questions  are  hard  to  come  by;  and  the  lack  of  reliable  information 
about  blindness  has  long  been  an  obstacle  to  the  planning  of  services 
and  devices  for  the  blind.  The  fact  is  that  although  blindness  is  an  end¬ 
lessly  fascinating  subject  and  in  some  countries  a  major  health  problem, 
we  know  relatively  little  about  it.  This  is  perhaps  less  the  case  in  countries 
with  centralized  services  than  in  underdeveloped  countries  or  countries  like 
ours  where  there  is  neither  a  central  agency  for  the  blind  nor  a  central 
depository  of  statistical  data  about  blindness.  Nevertheless,  in  this  paper 
I  shall  try  to  pull  together  such  information  as  we  have  about  the  charac¬ 
teristics  of  blind  persons  in  the  United  States,  particularly  material  that 
may be of interest to those of you who are trying to apply modern
technology to the needs of the blind.

At  the  very  outset  it  should  be  noted  that  the  prevalence  and  charac¬ 
teristics  of  blindness  vary  widely  throughout  the  world.  According  to  the 
World  Health  Organization  (WHO),  there  are  between  10  and  15  million 
blind  people  on  earth,  most  of  them  living  in  the  underdeveloped  coun¬ 
tries.  Definitions  of  blindness  vary  and  it  is  difficult  to  get  comparable 
figures;  but  the  WHO  has  estimated  that  while  the  rate  or  prevalence  of 
blindness  in  North  America  and  Western  Europe  is  approximately  two 
per  thousand,  the  rate  elsewhere  in  Europe  is  at  least  twice  as  high,  and  in 
certain  Eastern  Mediterranean  countries  and  much  of  Africa  it  is  between 
six  and  ten  times  as  high.  Furthermore,  while  blindness  in  the  more  ad¬ 
vanced  countries  is  increasingly  associated  with  old  age,  in  the  under¬ 
developed  countries  large  numbers  of  children  are  also  affected.  It  goes 
without  saying  that  the  demand  for  services  and  devices  will  vary  widely 
according  to  the  number  and  characteristics  of  the  blind  population  in 
question.  While  the  needs  of  blind  Americans  may  not  be  too  different 
from those of blind persons in Western Europe, it is unlikely that the
statistical portrait I shall present would fit the less developed countries.
And on a world-wide scale it is precisely in those countries that the
problems of blindness are most serious.

Limiting  ourselves  to  the  United  States,  what  do  we  find?  Blindness 
has  not  been  recorded  in  the  decennial  U.  S.  census  since  1930.  The  little 
that  we  do  know  about  the  number  and  characteristics  of  blind  persons 
is  based  largely  on  reports  from  the  few  states  that  do  collect  fairly  reliable 
statistics,  and  on  sample  surveys  conducted  by  the  federal  government  or 
by  private  agencies  such  as  the  American  Foundation  for  the  Blind.  Ac¬ 
cording  to  the  best  and  most  recent  estimate  (by  Dr.  Ralph  Hurlin  of 
the  National  Society  for  the  Prevention  of  Blindness)  there  are  approxi¬ 
mately  385,000  legally  blind  persons  in  the  United  States.  This  amounts 
to  a  prevalence  rate  of  2.14  per  thousand  population  for  the  country  as 
a  whole,  although  state  rates  vary  considerably — ranging  from  a  high  of 
3.98  in  Hawaii  to  a  low  of  1.39  in  Utah  (3).  Our  legal  or  “economic” 
definition  of  blindness  includes  all  persons  who  have  20/200  visual  acuity 
or  less  in  the  better  eye  with  correction,  or  a  comparable  visual  field  defect. 
In  ordinary  terms  this  means  anything  less  than  ten  percent  of  “normal” 
vision,  and  people  who  fall  into  this  category  are  eligible  by  law  to  receive 
various  services  and  benefits — including  special  education,  reading  services, 
vocational  training  and  placement,  special  devices  and  tools,  travel  con¬ 
cessions,  tax  exemptions,  and  financial  aid  if  they  are  needy. 
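
As a rough check on the figures in this paragraph, the sketch below recovers the population base implied by the estimate of 385,000 legally blind persons at a rate of 2.14 per thousand, and expresses 20/200 acuity as a fraction of "normal" vision. The function names are illustrative and are not taken from the sources cited.

```python
def implied_population(cases: int, rate_per_thousand: float) -> float:
    """Population base implied by a case count and a prevalence rate per 1,000."""
    return cases / rate_per_thousand * 1000.0


def snellen_fraction(test_distance: float, reference_distance: float) -> float:
    """Snellen acuity (e.g. 20/200) expressed as a simple fraction."""
    return test_distance / reference_distance


population = implied_population(385_000, 2.14)
acuity = snellen_fraction(20, 200)

print(f"Implied population base: {population / 1e6:.1f} million")
print(f"20/200 acuity as a fraction of normal: {acuity:.0%}")
```

The second figure shows where the "ten percent" in the ordinary-language definition comes from: 20/200 is one-tenth of 20/20.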

The  legal  definition  of  blindness  in  the  United  States  as  in  other  coun¬ 
tries  is  an  arbitrary  classification,  and  it  includes  some  people  with  useful 
vision  and  excludes  many  others  who  are  functionally  blind.  Relatively 
few  of  the  legally  blind  are  totally  blind.  Indeed,  such  evidence  as  is  avail¬ 
able  suggests  that  no  more  than  one-quarter  of  them  have  suffered  total 
loss  of  sight,  at  least  one-third  have  reasonably  good  travel  vision,  and 
nearly  one-fifth  have  some  degree  of  reading  vision.  It  is  hardly  surprising 
to  find,  as  we  did  in  a  survey  of  nearly  700  legally  blind  adults  in  four 
states,  that  half  of  them  told  us  that  they  do  not  consider  themselves  blind. 
Why  should  they?  When  respondents  with  more  than  light  perception  were 
asked  to  imagine  themselves  blindfolded,  70  percent  said  they  would  be 
unable  to  function  as  travelers  or  in  doing  their  housework.  In  other  words, 
many  persons  regarded  by  society  as  blind  use  their  remaining  vision. 

On  the  other  hand,  there  are  many  others  not  regarded  as  blind  who 
must  cope  with  fairly  serious  visual  impairment.  This  is  clearly  shown  in 
the recent U. S. National Health Survey. In that survey, based on
household interviews with a nationwide sample of the population, people were
classified as blind if they had a visual impairment severe enough so that
they  were  unable  to  read  ordinary  newspaper  print  even  with  the  aid  of 
glasses.  Here  we  have  what  has  been  called  a  functional  or  behavioral 
definition  of  blindness;  that  is,  it  describes  people  who  behave  as  if  they 
were  blind  and  undoubtedly  includes  many  who  are  not  legally  blind  or 
whose  visual  acuity  could  be  improved  considerably  with  optical  aids. 
On  the  basis  of  this  definition,  960,000  persons  were  reported  as  blind 
in  1957-1958,  a  rate  nearly  three  times  as  great  as  that  ever  reported  for 
legal  blindness.  In  addition  the  National  Health  Survey  reported  more 
than  two  million  cases  of  other  severe  visual  impairment — persons  who 
were  blind  in  one  eye  but  with  sight  in  the  other,  or  those  who  had  poor 
vision  or  trouble  seeing  in  one  or  both  eyes,  but  whose  visual  impairment 
was  less  severe  than  functional  blindness  (4).  While  this  Congress  was 
concerned  chiefly  with  the  small  group  of  legally  blind  people  who  are 
most  urgently  in  need  of  “technical  assistance,”  it  is  worth  noting  the 
existence  of  a  much  larger  group  who,  if  not  blind  in  the  legal  sense  of 
the  term,  face  serious  restrictions  on  activity  and  mobility. 

Regarding  the  legally  blind,  it  can  be  said  first  that  they  are  growing 
in  number.  If  our  estimates  are  accurate  they  have  increased  by  approxi¬ 
mately  67  percent  since  1940,  while  the  total  population  of  the  United 
States  has  risen  by  only  36  percent  during  this  period.  The  explanation  for 
this  trend  is  that  blindness  is  increasingly  associated  with  the  diseases  of 
old  age;  and  age  is  perhaps  the  most  significant  bio-social  characteristic 
of  blind  persons  today.  Thus,  two  states  that  collect  fairly  reliable  statistics 
on  blindness — North  Carolina  and  Massachusetts — report  that  nearly  half 
of  all  their  blind  residents  were  65  years  of  age  and  over  in  1960.  Indeed, 
the prevalence or rate of blindness among persons 65 and over is 30 times
as  great  as  among  those  under  20.  Furthermore,  new  cases  of  blindness — 
of  which  the  three  major  causes  are  cataracts,  glaucoma,  and  diabetes — 
are far more likely to occur among older than among younger age groups:
in  1957  well  over  half  of  all  new  cases  of  blindness  in  the  United  States 
struck  persons  65  years  of  age  and  over.  According  to  the  National  Health 
Survey,  persons  defined  as  functionally  blind  are  even  more  likely  to  be 
counted  among  the  aged — over  two-thirds  are  65  and  over — than  the 
legally  blind. 
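
The two growth figures quoted at the start of this discussion (67 percent for the legally blind, 36 percent for the total population since 1940) imply that the prevalence rate itself has risen. A minimal sketch of that arithmetic follows. Taking the 385,000 estimate as the end point of the period is an assumption made here only for illustration, and the names are hypothetical.

```python
def relative_rate_change(group_growth: float, population_growth: float) -> float:
    """Change in a prevalence rate when the group and the population grow at different rates."""
    return (1.0 + group_growth) / (1.0 + population_growth) - 1.0


BLIND_GROWTH = 0.67        # increase in the legally blind since 1940
POPULATION_GROWTH = 0.36   # increase in the total population over the same period
CURRENT_ESTIMATE = 385_000 # assumed end-point count (illustrative)

rate_change = relative_rate_change(BLIND_GROWTH, POPULATION_GROWTH)
implied_1940_count = CURRENT_ESTIMATE / (1.0 + BLIND_GROWTH)

print(f"Implied rise in prevalence rate: {rate_change:.1%}")
print(f"Implied 1940 count: about {implied_1940_count:,.0f}")
```

On these figures the rate of legal blindness would have risen by somewhat more than one-fifth, which is consistent with the shift toward older age groups described above.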

In  short,  blind  people  are  likely  to  be  old  and  old  people  are  likely 
to  experience  blindness.  With  increasing  longevity  our  blind  population 
is probably going to increase further, unless the diseases of old age
associated with blindness are checked. As we shall see, the disproportionate
number of aged blind persons raises fairly serious questions for anybody
who  wants  to  plan  services  and  devices  for  them.  But  before  looking  into 
these  questions,  it  may  be  helpful  to  review  some  of  the  other  major 
characteristics  of  our  blind  population. 

Precisely  because  they  are  so  old,  blind  persons  are  more  likely  than 
the  general  population  to  suffer  from  other  chronic  conditions  or  ailments 
besides  loss  of  sight.  In  our  four-state  survey  of  700  blind  adults  mentioned 
above,  nearly  two-thirds  of  our  respondents  reported  that  they  had  some 
other  chronic  condition  or  ailment.  Of  course,  these  conditions  are  not  all 
equally  disabling,  but  in  varying  degree  they  impose  further  limitations 
on  the  activity  and  mobility  of  the  blind  individual. 

Another  characteristic  of  the  blind  in  the  United  States  is  that  they 
are  less  educated  than  the  general  population.  In  our  four-state  survey, 
the  proportion  of  adults  with  at  least  some  college  education  was  approximately
the  same  as  in  the  general  population.  But  while  two-thirds  of
all  Americans  18  years  of  age  and  above  had  at  least  some  high  school 
education  in  1959,  the  corresponding  figure  in  our  blind  sample  was  close 
to  one-half.  This  too  may  be  a  function  of  their  greater  age,  since,  as
measured  by  years  of  schooling,  older  age  groups  are  generally  less  educated
than  the  younger  ones.

Blind  persons  are  also  less  likely  to  be  working  than  their  sighted 
compatriots.  This  of  course  is  largely  but  by  no  means  entirely  due  to 
their  age.  While  more  than  half  of  the  total  civilian  noninstitutional  population
of  the  U.  S.  are  employed,  we  estimate  on  the  basis  of  our  four-state
sample  survey  that  less  than  a  quarter  of  blind  adults  and  no  more
than  10  percent  of  all  blind  persons  are  presently  employed.  (Under-employment
is  also  typical  of  blind  people  in  England  and  Wales,  where  in
1957  less  than  a  third  of  all  blind  residents  of  working  age  and  only  11 
percent  of  all  blind  residents  were  working.) 

It  is  not  surprising,  therefore,  that  blind  persons  report  far  lower  incomes
than  the  general  population.  While  little  more  than  one-eighth  of  all
American  families  have  an  annual  income  of  less  than  $2000,  the  corresponding
figure  among  respondents  in  our  four-state  survey  was  more
than  one-half.  Large  numbers  of  blind  people  are  literally  wards  of  the
state:  approximately  half  of  all  blind  persons  in  the  United  States  are 
receiving  direct  financial  help  under  the  federal-state  program  of  aid  to 
the  blind  and  old  age  assistance.  But  as  recently  as  1960  the  average 
monthly  payment  to  blind  recipients  of  public  assistance  was  less  than  $70. 

Old,  idle,  poor,  dependent:  such  is  the  fate  of  many  blind  citizens  of
our  affluent  society.  What  this  means  to  the  individuals  concerned  is  often 
a  terrible  social  isolation — that  is,  being  cut  off  from  the  economic,  social, 
and  cultural  life  of  the  community.  Of  course  there  is  a  wide  and  complex
network  of  educational  institutions  and  social  agencies  (public  and
private)  which  seeks  to  provide  blind  persons  with  the  help  and  services 
they  need;  but  even  this  organizational  apparatus  is  unable  to  reach  all 
the  blind  people  who  need  help.  In  any  community  there  is  a  significant
minority  of  the  blind  population,  perhaps  as  many  as  a  third,  which
remains  hidden  or  unknown;  that  is,  unknown  to  the  agencies  operating
in  the  area  and  untouched  by  the  many  rehabilitation,  education,  vocational,
and  recreational  programs  which  they  have  established.  For  example,
the  federal  government  and  the  states  have  undertaken  an  ambitious  program
of  vocational  rehabilitation  of  the  physically  handicapped.  But  relatively
few  blind  people  (no  more  than  a  quarter  of  the  respondents  in  our
sample  survey)  have  been  reached  by  that  program.  One  of  the  most  important
research  tasks  in  the  future  will  be  to  find  out  about  the  characteristics
and  needs  of  the  unknown  blind.

Let  us  turn  now  to  consider  at  somewhat  greater  length  two  other 
aspects  of  blindness  which  may  have  special  interest  for  you:  mobility  and 
reading  behavior.  These  are  areas  in  which  much  of  the  research  and  instrumentation
you  are  developing  will  have  particular  application.  But
if  guidance  devices  and  reading  machines  are  to  be  further  developed,  who 
needs  help  most?  This  is  the  question  I  shall  now  try  to  answer. 

To  begin  with,  it  may  be  helpful  to  distinguish  between  limitations  on 
activity  (such  as  housework  and  employment)  and  restrictions  on  mobility,
which  may  range  from  confinement  to  the  house,  at  one  extreme,
to  limited  mobility  or  need  of  help  in  moving  around  outside  the  house,
at  the  other.  While  closely  related,  the  two  limitations  are  different  in
nature  and  in  impact.  Thus,  according  to  the  U.  S.  National  Health  Survey 
of  Impairments,  persons  defined  as  “visually  impaired”  (most  of  them  not 
legally  blind)  were  far  more  likely  to  experience  some  limitation  of  activity 
than  of  mobility.  Indeed,  among  those  45  years  of  age  and  over,  nearly 
40  percent  faced  some  limitation  of  activity  (18  percent  suffering  major 
limitation)  while  less  than  25  percent  faced  some  restriction  of  mobility 
(5  percent  with  a  major  restriction)  (5).

But  for  persons  with  more  serious  visual  impairment — the  legally 
blind — restrictions  on  mobility  constitute  the  most  important  problem 
they  face.  Just  how  important  is  suggested  by  the  results  of  our  four-state
survey  of  some  700  blind  adults.  Asked  by  our  interviewers,  “What  would 
you  say  is  the  most  important  problem  faced  by  a  blind  person?”  nearly 
half  of  our  respondents  mentioned  getting  around,  travelling,  or  falling 
down.  Indeed,  travel  was  mentioned  nearly  twice  as  often  as  the  next  most 
important  problem:  dependence.  Why  this  is  so  is  easy  enough  to  explain:
most  blind  people  cannot  travel  unaided.  It  may  surprise  you  to  learn
that  as  many  as  a  third  of  them  do  travel  unaided  (according  to  two  recent 
American  surveys);  but  the  fact  remains  that  the  great  majority  need  some 
kind  of  help.  According  to  a  study  recently  conducted  for  the  Seeing 
Eye,  Inc. — our  most  authoritative  source  of  information  about  the  mobility 
of  blind  persons — approximately  one-third  of  the  blind  (aged  15  to  54) 
rely  on  a  sighted  companion  as  their  primary  mode  of  travel,  nearly  one-fifth
use  a  cane,  and  only  6  percent  use  a  guide  dog  (2).  (Actually,  our
own  research  on  an  older  blind  population  shows  much  greater  reliance  on 
canes  and  much  less  use  of  guide  dogs.) 

As  may  be  expected,  the  mode  of  travel  selected  by  the  blind  person 
and  his  efficiency  in  getting  about  vary  with  such  factors  as  degree  of 
visual  loss,  age,  and  general  physical  condition.  Thus  the  more  useful 
vision  a  blind  person  has  (and  remember  that  a  fairly  large  proportion  of 
the  blind  do  have  some  travel  vision)  the  more  likely  that  he  will  be  able  to 
travel  unaided.  But  the  relationship  between  vision  and  mode  of  travel  is 
not  always  so  simple.  According  to  the  Seeing  Eye  study  nearly  half  of 
those  who  rely  on  human  guides  had  more  than  light  perception,  and  although
cane  users  as  a  group  had  less  vision  than  the  human  guide  users,
they  travel  more  independently.  Vision  alone  does  not  explain  the  selection 
of  travel  techniques. 

On  the  other  hand,  vision  is  closely  related  to  the  frequency,  extensiveness,
and  independence  of  travel  and  to  general  travel  efficiency.  In
the  Seeing  Eye  survey  only  one-quarter  of  the  respondents  travelled  with 
any  frequency,  and  the  more  vision  remaining  to  them,  the  more  frequently 
they  travelled.  Similar  statements  can  be  made  about  the  extensiveness  of 
travel.  Only  half  of  the  blind  people  in  the  Seeing  Eye  study  travelled  extensively,
and  those  who  did  were  more  likely  to  have  some  useful  vision.
As  for  travel  independence,  or  the  ability  to  visit  unfamiliar  places,  only 
a  third  of  the  persons  interviewed  were  scored  as  high  in  this  respect;  as 
might  be  expected  those  with  more  than  light  perception  were  more  likely 
to  achieve  such  independence  than  those  who  had  no  useful  vision.  After 
developing  a  composite  or  general  measure  of  travel  efficiency,  the  authors 
of  the  Seeing  Eye  study  found  that  less  than  one-quarter  of  their  respondents
achieved  “high”  travel  efficiency,  while  three-quarters  failed  to
reach  a  minimum  standard  of  mobility  in  terms  of  frequency,  extensiveness, 
and  independence  of  travel.  Again,  persons  with  more  than  light  perception 
were  more  likely  to  become  efficient  travellers  than  those  who  were  totally 
blind  or  had  no  more  than  light  perception. 

So  far  we  have  considered  travel  performance  and  its  relationship  to 
the  degree  of  visual  loss.  Now  we  shall  discuss  travel  performance  as  it  is 
affected  by  the  mode  of  travel.  How  do  blind  travellers  vary  according 
to  the  methods  they  use?  First,  let  us  look  at  blind  persons  who  rely  on 
sighted  guides.  In  the  Seeing  Eye  study  they  made  up  nearly  one-third  of 
the  entire  sample;  they  turned  out  to  have  more  vision  than  cane  or  dog 
guide  travellers,  although  less  than  blind  people  who  travel  unaided.  However,
their  travel  performance  was  lower  than  that  of  any  other  group,
regardless  of  the  degree  of  visual  loss.  How  can  this  be  explained?  The 
authors  of  the  Seeing  Eye  study  provide  an  answer:  “It  is  as  if  once  the 
human  guide  mode  of  travel  is  depended  on,  a  markedly  restricted  pattern 
of  travel  performance  is  enforced,  regardless  of  visual  status.”  On  the 
other  hand,  cane  travellers  (nearly  a  fifth  of  the  sample  and  with  less 
vision  than  most  other  blind  travellers)  achieve  much  higher  levels  of 
travel  performance  than  persons  relying  on  sighted  guides,  although  they  do 
not  get  about  as  well  as  dog  guide  users  or  those  who  travel  unaided.  Curiously,
cane  users  with  low  vision  appear  to  be  more  efficient  travellers  than
those  with  high  vision,  perhaps  because  they  are  more  likely  to  have  received
training  in  the  use  of  the  cane.  As  for  dog  guide  users,  they  are
extremely  few  in  number  (the  smallest  of  all  travel  groups  among  the 
blind)  and  the  poorest  in  vision.  By  most  measures  of  travel  efficiency 
they  do  better  than  cane  users.  The  last  group  to  be  considered  is  composed
of  blind  persons  who  travel  unaided.  Most  of  them  score  high  in
vision  and,  as  may  be  expected,  they  achieve  greater  travel  efficiency  than 
cane  users  and  those  who  rely  on  human  guides,  although  they  do  no  better 
than  dog  guide  users  in  the  high  vision  categories. 

Summing  up,  blind  persons  relying  on  human  guides  achieve  low  levels 
of  travel  performance,  cane  users  occupy  an  intermediate  position,  while 
dog  guide  users  do  better  than  both  of  the  other  groups.  Whatever  the  mode, 
few  blind  travellers  do  very  well.  It  is  hardly  surprising  therefore  that  relatively
few  blind  people  are  satisfied  with  their  travel  performance.  Indeed,
the  authors  of  the  Seeing  Eye  study  conclude:  “Objectively,  the  potential 
for  fully  satisfactory  travel  performance  is  not  reached  by  most  blind  persons.
Moreover,  there  is  a  general  trend  of  dissatisfaction  with  travel  performance.
Yet  the  dissatisfaction  is  not  translated  into  active  efforts  to
improve  travel  performance,  or  to  give  considerable  thought  to  the  selection
of  a  travel  mode.  What  results  is  a  low  level  of  performance  and  satisfaction,
a  situation  which  seems  to  be  accepted  by  the  majority  of  blind
persons  as  part  of  the  limitations  imposed  by  blindness.  This  is,  of  course, 
an  interpretation  of  one  side  of  the  situation.  The  other  side  is  that  there 
may  be  insufficient  stimulation  and  help  because  of  over-protective  attitudes 
of  family  and  community,  and  because  of  the  insufficiency  of  highly  skilled 
resources  for  rehabilitation”  (2,  p.  51).

The  implications  for  those  of  you  interested  in  guidance  devices  are 
plain  enough.  Most  blind  people  face  serious  difficulties  in  getting  about, 
are  dissatisfied  with  their  travel  performance,  and  are  in  need  of  training 
and  help.  But  so  far  very  few  of  them  have  received  the  help  they  need 
(in  our  own  survey  of  blind  adults,  only  15  percent  of  our  respondents 
had  been  given  some  form  of  travel  training).  No  wonder  many  of  them 
consider  travel  the  most  important  problem  they  face.  In  a  highly  mobile 
society  like  ours,  travel  restrictions  represent  major  barriers  to  the  achievement
of  a  richer,  fuller  life.  If  the  blind  are  socially  isolated  and  limited
in  their  social,  cultural,  and  economic  activities,  it  is  due  in  considerable 
part  to  the  very  severe  restrictions  on  mobility  which  they  experience. 
Let  me  give  you  a  few  examples.  In  our  own  survey  we  gave  respondents 
a  choice  and  asked  them  which  one  of  these  four  things  they  would  rather 
do — watch  TV,  listen  to  the  radio,  read  or  listen  to  a  book,  or  visit  friends. 
More  than  40  percent  of  them  said  they  would  rather  visit  friends,  considerably
more  than  mentioned  any  of  the  other  activities.  Now  we  found
that  60  percent  of  them  actually  get  together  with  friends  at  least  once 
a  week — a  surprisingly  high  proportion.  Nevertheless,  nearly  two-thirds 
of  those  whose  blindness  began  after  age  13  told  us  that  they  get  together 
with  friends  less  now  than  before  their  trouble  with  seeing  began.  In  other 
words,  visiting  with  friends  is  the  favorite  form  of  leisure  behavior  among 
blind  adults.  They  engage  in  a  fairly  heavy  pattern  of  visiting,  but  still 
do  so  less  than  before  their  loss  of  sight. 

To  get  a  more  complete  picture  of  the  social  life  of  blind  adults,  we 
devised  an  index  combining  visiting  with  friends  and  participation  in  clubs 
or  organizations.  This  composite  yields  a  measure  of  active  participation  at 
one  extreme  and  social  isolation  at  the  other.  What  we  found  was  that 
one-fifth  of  our  respondents  were  quite  active:  they  visited  with  friends 
at  least  once  a  week  and  also  attended  meetings  at  least  once  in  a  while. 
At  the  other  extreme  were  one-quarter  of  our  sample  who  can  be  considered
socially  isolated:  they  visited  with  friends  no  more  than  two  or
three  times  a  month  and  participated  in  no  organizational  life  whatsoever. 
It  is  this  group  which  is  most  deprived  of  social  intercourse  and  most  in 
need  of  help.  Many  blind  people  lead  lonely  lives:  in  our  survey  more 
than  a  third  had  no  other  family  members  living  in  their  communities. 
Although  a  third  of  them  said  that  they  prefer  to  do  things  alone,  this  may 
be  due  to  the  fact  that  they  are  alone  much  of  the  time. 
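
The two extremes of the participation index described above can be sketched in code. This is only an illustration under stated assumptions, not the coding scheme actually used in the survey: the function name, the middle category, and the coding of visits as a monthly count (four or more per month standing in for "at least once a week", three or fewer for "no more than two or three times a month") are all hypothetical.

```python
from enum import Enum


class Participation(Enum):
    ACTIVE = "active"
    INTERMEDIATE = "intermediate"  # hypothetical label for everyone between the extremes
    ISOLATED = "isolated"


def classify_participation(visits_per_month: float, attends_meetings: bool) -> Participation:
    """Place one respondent on the visiting-plus-organizations index.

    Assumption: visits are coded as a monthly count; the source gives only
    the verbal categories, not numeric cut-offs.
    """
    visits_weekly = visits_per_month >= 4   # "at least once a week"
    visits_rarely = visits_per_month <= 3   # "no more than two or three times a month"

    if visits_weekly and attends_meetings:
        return Participation.ACTIVE
    if visits_rarely and not attends_meetings:
        return Participation.ISOLATED
    return Participation.INTERMEDIATE
```

Under this sketch a respondent who visits friends weekly but belongs to no organization falls in neither extreme, which is why the one-fifth scored as active and the one-quarter scored as isolated do not exhaust the sample.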

For  an  additional  measure  of  our  respondents’  social  life,  we  constructed
a  weighted  index  of  activities,  including  employment,  length  of  the
work  week,  shopping,  visiting  friends,  membership  in  clubs  or  organizations,
and  attending  church.  We  found  that  while  more  than  a  fifth  of
them  scored  high  in  such  social  activities,  an  equal  proportion  scored 
very  low.  A  similar  distribution  was  found  when  we  measured  nonsocial 
or  more  sedentary  activities,  including  listening  to  the  radio,  watching  TV, 
reading  books,  going  to  the  movies,  and  hobbies.  No  matter  how  we 
measured  “activity”  we  found  that  between  15  and  25  percent  of  the  blind 
adults  in  our  sample  were  extremely  inactive.  As  you  would  expect,  we 
found  that  activity  varies  with  the  amount  of  travel  vision,  education,  income,
physical  condition,  travel  training,  and  most  notably  with  age.  That
is,  those  most  likely  to  be  active  were  people  with  good  travel  vision,  the 
better  educated,  the  higher  income  groups,  those  without  other  chronic 
conditions,  those  who  had  received  travel  training,  and  the  younger  age 
groups. 

More  sophisticated  guidance  devices  and  more  travel  training  will  not 
automatically  enable  blind  people  to  lead  richer  lives;  but  they  will  certainly 
make  it  easier  to  attack  other  barriers  which  tend  to  isolate  them  from  the 
community.  To  help  in  overcoming  these  barriers  would  seem  to  be  among 
the  more  important  goals  for  technological  research. 

If  blind  people  are  seriously  restricted  in  mobility,  they  also  are  limited 
in  their  access  to  other  forms  of  communication,  especially  the  spoken 
or  printed  word.  This  statement  does  not  apply  so  much  to  the  broadcast 
media  as  it  does  to  books  and  other  publications.  Thus,  according  to  our 
recent  survey  of  leisure  behavior  among  the  adult  blind,  nine  out  of  ten 
listen  to  the  radio  and  one-quarter  of  our  respondents  told  us  that  they 
listen  four  hours  or  more  daily.  As  may  be  expected,  television  is  less 
important  in  their  lives.  Nevertheless,  three-quarters  of  our  respondents 
said  that  they  watch  television;  and  10  percent  of  them  watch  it  four  or 
more  hours  a  day.  In  other  words,  the  broadcast  media  have  come  to  play
a  very  important  part  in  the  daily  lives  of  blind  people,  so  many  of  whom 
are  elderly  and  confined  to  their  homes.  In  their  attachment  to  radio  and 
television,  blind  people  are  not  very  different  from  their  sighted  neighbors. 

But  if  blind  persons  make  heavy  use  of  radio  and  television,  the
same  cannot  be  said  of  their  reading  behavior,  which  I  shall  now  discuss 
at  greater  length.  Although  great  efforts  have  been  devoted  to  providing 
blind  people  with  books,  the  fact  is  that  many  of  them  do  not  read  anything.
In  our  four-state  survey  of  blind  adults  we  inquired  about  their
reading  interests  and  problems;  and  while  we  have  not  yet  completed  our 
analysis,  I  can  give  you  some  preliminary  findings  from  this  study.  Perhaps 
our  major  conclusion  was  that  while  large  numbers  (perhaps  as  many  as
half)  of  all  blind  adults  have  had  no  exposure  to  books  or  other  publications,
they  are  nevertheless  more  likely  to  read  books  than  their  sighted
contemporaries.  But  then  Americans  in  general  are  known  to  be  light 
readers,  particularly  when  compared  with  Western  Europeans. 

Actually  it  is  difficult  to  compare  the  reading  behavior  of  blind  and 
sighted  people.  First  of  all,  the  modes  of  reading  differ  widely,  as  does  the 
distribution  of  reading  matter.  Thus  while  sighted  persons  can  obtain  a 
huge  number  of  books  from  shops  and  libraries  in  their  communities,  blind 
readers  depend  almost  exclusively  on  a  regional  library  system  which  can 
produce  only  a  small  sampling  of  the  titles  in  ordinary  print  and  must 
distribute  them  by  mail.  Furthermore,  there  has  been  no  study  of  general 
reading  behavior  in  the  United  States  for  more  than  a  dozen  years  (1).
Past  studies,  however,  suggested  that  not  much  more  than  a  quarter  of 
the  sighted  population  read  books.  As  noted  above,  our  own  survey  indicates
that  half  of  the  blind  population  are  readers.  Comparing  actual
readers  in  the  two  populations,  we  found  not  only  that  blind  people  are 
more  likely  to  read  than  sighted  persons  but  that  they  are  also  more  likely 
to  be  heavy  readers.  In  the  last  national  survey  of  sighted  readers,  only  8 
percent  were  identified  as  “heavy”  readers  (i.e.,  had  read  more  than  four 
books  during  the  previous  month).  In  our  sample  of  blind  adults  the 
proportion  of  heavy  readers  was  twice  as  large:  17  percent.  This  is  all  the 
more  striking  when  one  realizes  that  our  survey  was  limited  to  blind  persons 
20  years  and  older;  and  we  know  that  reading  declines  with  increasing  age. 

If  these  figures  seem  surprising,  it  must  be  noted  that  we  defined  reading
among  the  blind  to  include  not  only  braille  and  records  but  ordinary
print  (in  our  sample  14  percent  had  reading  vision)  and  reliance  on  sighted 
readers  as  well.  The  addition  of  sighted  readers  and  ordinary  print  gave 
us  a  higher  proportion  of  readers  than  we  would  have  obtained  if  we  had
limited  ourselves  to  braille  and  records  alone.  Indeed,  when  we  asked  about 
their  primary  mode  of  reading,  we  found  that  while  more  than  half  of 
our  readers  used  records,  the  next  largest  group  (over  one-quarter)  read 
with  the  help  of  sighted  readers.  More  striking  still,  the  proportion  who 
read  ordinary  inkprint  (9  percent)  was  larger  than  the  number  who  read 
braille  (8  percent).  Of  course,  many  blind  readers  use  more  than  one 
technique  of  reading.  The  Library  of  Congress  reports  that  16  percent  of 
its  readers  use  both  records  and  braille. 

What  do  these  figures  mean?  First  of  all,  they  reflect  the  technological 
revolution  in  reading  which  was  brought  about  by  the  development  of  the 
long  playing  record.  More  blind  readers  depend  on  records  than  on  all  other 
modes  of  reading  combined.  Throughout  the  country  some  65,000  blind 
persons  receive  talking  book  records  from  the  Library  of  Congress  and 
our  system  of  regional  libraries  for  the  blind.  At  the  same  time  there  appears
to  have  been  a  significant  drop  during  the  past  20  years  in  the  number
of  braille  readers.  In  1940,  6  percent  of  the  total  estimated  blind  population 
in  the  United  States  were  being  sent  braille  books  by  the  regional  libraries 
for  the  blind.  By  1960  the  proportion  of  braille  readers  had  fallen  to  2 
percent.  In  our  own  survey  of  blind  adults  braille  readers  represented  only 
28  percent  of  all  those  able  to  read  braille.  That  is,  little  more  than  a 
quarter  of  all  blind  respondents  with  the  ability  to  read  braille  were 
actually  reading  braille  books.  This  figure  was  reduced  further  when  we 
measured  braille  readers  as  a  proportion  of  all  blind  book  readers  in  our 
sample.  Here  the  figure  fell  to  a  little  more  than  15  percent.  And  when  we 
counted  braille  readers  as  a  proportion  of  our  total  sample,  only  8  percent 
turned  out  to  be  presently  reading  braille.  As  noted  earlier,  even  fewer  rely 
chiefly  on  braille  as  a  mode  of  reading. 
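
The three braille percentages above describe the same group of readers measured against three different bases, and a short calculation shows that they are mutually consistent. The sketch below is an illustration only: the variable and function names are hypothetical, and because the text gives rounded figures ("a little more than 15 percent"), the implied group sizes are approximate.

```python
# Shares quoted in the text (rounded in the source).
BRAILLE_READERS_OF_TOTAL_SAMPLE = 0.08   # braille readers as a share of all respondents
BRAILLE_READERS_OF_THOSE_ABLE = 0.28     # ... as a share of those able to read braille
BRAILLE_READERS_OF_BOOK_READERS = 0.15   # ... as a share of all book readers


def implied_group_share(share_of_sample: float, share_of_group: float) -> float:
    """Size of a group, as a fraction of the whole sample, implied by the
    same set of people expressed as a share of the sample and of the group."""
    return share_of_sample / share_of_group


# Respondents able to read braille: roughly 29 percent of the sample.
able_to_read_braille = implied_group_share(
    BRAILLE_READERS_OF_TOTAL_SAMPLE, BRAILLE_READERS_OF_THOSE_ABLE
)

# Book readers of any kind: roughly 53 percent of the sample.
book_readers = implied_group_share(
    BRAILLE_READERS_OF_TOTAL_SAMPLE, BRAILLE_READERS_OF_BOOK_READERS
)

print(f"able to read braille: {able_to_read_braille:.1%}")
print(f"book readers:         {book_readers:.1%}")
```

The implied share of book readers, a little over one-half, agrees with the earlier statement that about half of the blind adults surveyed are readers.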

In  short,  the  evidence  we  have  accumulated  suggests  that  braille  is 
relatively  insignificant  as  a  reading  device,  at  least  among  blind  adults, 
although  it  has  traditionally  been  important  in  the  education  of  young  blind 
persons.  Nevertheless,  a  recent  study  of  reading  among  blind  American 
college  students  showed  that  their  principal  study  technique  was  listening 
to  the  spoken  word:  43  percent  of  their  textbook  reading  was  done  with 
the  help  of  sighted  readers,  27  percent  on  records,  15  percent  on  tapes,  9 
percent  in  inkprint,  and  only  4  percent  in  braille.  Most  blind  students  also 
prefer  to  have  their  textbook  material  in  recorded  form.  The  trend  is  most 
definitely  toward  the  use  of  recorded  materials,  including  tapes. 

On  the  whole  most  blind  readers  are  satisfied  with  the  services  being 
provided  them.  Only  3  percent  of  the  book  readers  in  our  survey  expressed
general  dissatisfaction  with  the  library  services  available  to  them;
and  relatively  few  (only  about  13  percent)  said  that  they  had  had  any  difficulty  receiving  or
returning  books  or  records  through  the  mail.  A
much  larger  proportion  (28  percent)  of  the  readers  said  that  there  were 
books  that  they  would  like  to  read  that  are  not  now  available  to  them  in 
records  or  braille.  Almost  one-quarter  of  them  said  there  were  ways  in 
which  library  services  in  their  areas  could  be  improved.  But  most  who  rely 
on  these  services  were  pretty  well  satisfied  with  the  selection  of  books  by 
the  Library  of  Congress  and  with  the  work  being  done  by  their  regional 
libraries.  They  do  sometimes  express  complaints  about  the  condition  of  the 
braille  books  they  receive  (reporting  that  some  of  them  are  dog-eared, 
some  torn,  some  battered)  or  about  the  problem  of  carrying  containers 
around  and  mailing  and  receiving  them.  They  are  more  likely,  however
(especially  in  the  case  of  a  small,  vocal,  and  more  sophisticated  minority),  to
complain  about  the  lack  of  certain  types  of  books  in  which  they  are  interested.
But  considering  the  necessity  to  select  just  a  few  titles  from  the
many  thousands  available  in  ordinary  print,  it  would  be  surprising  if  this 
were  not  the  case. 

A  word  now  about  the  importance  of  tape.  In  the  United  States,  where 
maximum  effort  in  the  production  of  books  for  blind  readers  has  been  put 
into  disc  recordings  and  braille,  the  development  of  tapes  has  so  far  been 
unplanned,  and  for  the  most  part  blind  people  themselves  must  make  the 
necessary  investment  in  expensive  tape  equipment.  Nevertheless,  in  our 
four-state  sample  survey  of  blind  adults,  some  10  percent  reported  that  they 
had  tape  recorders  of  their  own;  and  a  Library  of  Congress  survey  found 
that  nearly  one-fifth  of  a  sample  of  readers  had  tape  recorders.  On  the  other 
hand,  only  1  percent  of  the  readers  in  our  survey  said  that  they  do  most  of 
their  reading  on  tape — a  reflection  no  doubt  of  the  relative  scarcity  of 
taped  materials  or  of  serious  difficulties  in  distributing  such  materials.  In 
this  country  at  least  tapes  have  yet  to  make  their  full  impact  on  reading 
behavior. 

Of  course,  books  are  not  the  only  reading  materials  which  reach  the 
blind.  In  our  sample  survey  half  of  our  respondents  reported  that  they 
were  having  newspapers  read  to  them  (although  only  one-fifth  of  them  do 
so  on  a  regular  basis).  Furthermore,  more  than  40  percent  of  them  were 
receiving  magazines  in  braille,  on  records,  or  in  ordinary  print;  and  more 
than  a  fifth  were  having  magazines  read  to  them.  In  most  cases  (three-quarters,
to  be  exact)  it  is  other  family  members  who  perform  this  vital
service.

Great  strides  have  been  made  in  providing  blind  persons  with  reading 
materials  and  services,  but  it  is  still  just  a  beginning  and  there  is  scarcely 
room  for  complacency.  Thus,  in  our  interviews  with  adult  blind  readers  we 
asked  whether  they  were  reading  as  many  books  as  they  would  like  to.  Little 
more  than  half  replied  affirmatively.  Then  too,  it  is  clear  that  for  many  of 
those  who  became  blind  in  adult  life  and  presumably  had  read  ordinary 
print  prior  to  their  loss  of  sight,  blindness  leads  to  a  decline  in  reading.  In 
our  survey  more  than  half  of  those  whose  blindness  began  after  age  30 
reported  that  they  were  now  reading  less  than  before  their  trouble  with 
seeing  began.  There  is  still  much  to  be  done  to  satisfy  those  who  are  presently
reading.

More  important,  what  about  the  many  blind  persons  in  the  United 
States  who  read  nothing?  The  government  program  of  talking  books  and 
braille  books  reaches  approximately  one  out  of  seven  blind  persons  in  the 
country.  Our  own  survey  suggests  that  if  other  modes  of  reading  are  added 
(especially  the  help  of  sighted  readers  and  ordinary  inkprint)  the  proportion
of  readers  is  somewhat  larger.  Even  so,  large  numbers  of  blind  persons
have  never  had  any  reading  experiences.  In  every  sense  of  the  word  this  is 
an  untapped  market — perhaps  as  large  as  200,000 — for  the  development 
and  distribution  of  reading  materials. 

At  this  point  it  needs  to  be  stressed  that  the  most  extraordinary  reading
machines  to  be  developed  in  the  future  and  even  a  great  increase  in  the 
number  of  titles  produced  will  not  guarantee  that  blind  nonreaders  can  be 
“converted”  into  active  readers.  First,  they  must  be  informed  about  the 
devices  and  services  available  to  them.  In  our  survey  more  than  a  quarter 
of  the  nonreaders  had  never  heard  about  the  talking  book  program  administered
by  the  Library  of  Congress,  although  this  program  is  30  years  old.
Second,  they  must  be  motivated  to  read.  Again,  in  our  survey  40  percent 
of  the  nonreaders  said  they  felt  no  need  for  any  of  the  book  services  available
to  them  and  could  not  think  of  anything  that  would  make  them  want
to  read.  It  would  be  naive  to  expect  that  all  blind  persons  can  become  readers;
even  if  only  a  few  can  be  helped,  the  effort  will  be  worthwhile.  But  if
they  are  to  be  helped,  great  efforts  in  education  will  be  needed.  Without  such 
education  people  will  not  read,  however  easy  it  is  made  for  them.  These  are 
just  some  of  the  obstacles  to  be  overcome. 

Another  obstacle,  common  in  the  development  of  new  technological 
devices,  is  resistance  to  change.  As  noted  earlier,  most  blind  readers  are 
fairly  well  satisfied  with  the  services  they  are  getting.  In  our  survey  we 
asked  respondents  whether  they  would  be  interested  in  getting  records  that
offered  twice  or  even  four  times  as  many  hours  of  listening  as  the  talking 
book  records  currently  in  use.  Some  60  percent  said  that  they  would  be 
interested,  but  nearly  a  third  said  that  they  were  not  at  all  interested.  When 
asked  about  their  attitudes  toward  multitrack  tapes,  the  response  was  very 
nearly  the  same.  I  am  not  suggesting  that  this  resistance  to  change  cannot  be 
overcome,  but  I  am  urging  that  it  be  recognized.  Considerable  effort  will 
have  to  be  made  to  educate  and  prepare  blind  persons  for  the  many  future 
revolutions  in  reading  devices  and  methods. 

It  is  time  to  sum  up  this  brief  sketch  of  the  characteristics  and  needs 
of  blind  Americans.  I  have  tried  to  give  you  some  idea  of  the  populations 
that  will  be  helped  by  the  devices  you  are  working  on.  (Most  of  what  I  have 
said  applies  to  the  relatively  small  number  of  legally  blind.  But  while  this 
group  is  most  seriously  in  need  of  help,  many  others — not  legally  blind  yet 
severely  impaired  visually — may  also  be  considered  as  potential  users  of 
more  advanced  guidance  and  reading  devices.)  First  of  all,  at  least  in  the 
United  States,  the  legally  blind  are  growing  in  number  and  are  increasingly 
an  elderly  group.  (It  has  been  estimated  that  three-quarters  of  a  million 
persons  now  living  will  become  blind  unless  preventive  efforts  can  be  made 
more  effective.)  Economically  underprivileged  and  forced  into  a  leisure  for 
which  they  are  poorly  prepared,  blind  persons  find  themselves  cut  off  from 
many  of  the  social  and  cultural  activities  of  their  communities.  Perhaps 
above  all  they  are  isolated  physically.  The  most  important  problem  they 
face is restriction on mobility: few are trained to travel, and whatever mode
they employ, most fail to achieve a reasonable efficiency in getting about.
In  a  second  major  area  of  communication,  reading  behavior,  the  picture  is 
also one of deprivation. At least half of all blind adults do little or no
reading, and many of those who do use braille, records, or tapes are reading
much  less  than  they  would  like  to. 

In  view  of  the  increasing  number  of  blind  persons  and  the  likelihood 
that blindness will increase among the elderly, there is nothing more
important than to help them achieve greater efficiency and satisfaction as
travellers and as readers. Meanwhile, as we learn more about the
characteristics and needs of blind people throughout the world and as you
progress  in  the  development  of  new  devices — two  efforts  that  should  be 
coordinated — we  can  look  forward  to  a  day  when  all  blind  people  will  be 
able  to  share  more  equitably  in  the  life  of  their  communities  and  win  for 
themselves  a  fuller  and  richer  life. 




REFERENCES 

1. Campbell, Angus, and Charles A. Metzner. Public Use of the Library and of
Other Sources of Information. Ann Arbor, Michigan: University of Michigan,
1950 (Institute for Social Research).

2. Finestone, Samuel, Irving F. Lukoff, and Martin Whiteman. The Demand for
Dog Guides and the Travel Adjustment of Blind Persons. New York: Columbia
University, 1960 (Research Center, The New York School of Social Work).

3. Hurlin, Ralph G. “Estimated Prevalence of Blindness in the United States and
in Individual States, 1960,” Sight-Sav. Rev., Vol. 42, No. 1 (1962).

4. Impairments by Type, Sex, and Age, United States, July 1957 - June 1958. (U. S.
National Health Survey.) Washington, D. C.: U. S. Department of Health,
Education, and Welfare, 1959 (Public Health Service, Division of Public
Health Methods).

5. Older Persons, Selected Health Characteristics, United States, July 1957 - June
1959. (U. S. National Health Survey.) Washington, D. C.: U. S. Department
of Health, Education, and Welfare, 1960 (Public Health Service, Division of
Public Health Methods).


THE RESEARCH AND DEMONSTRATION
GRANTS PROGRAM OF THE OFFICE
OF VOCATIONAL REHABILITATION

STEPHEN  P.  QUIGLEY 

Office  of  Vocational  Rehabilitation,  Department  of  Health,  Education,  and 
Welfare,  Washington,  D.  C. 


THE  DOMESTIC  PROGRAM 

The  purpose  of  the  research  and  demonstration  grants  program  in  the 
Office  of  Vocational  Rehabilitation  (OVR)  is  to  develop  new  knowledge 
and  new  techniques  and  to  validate  already  existing  methods  and  techniques 
of  rehabilitation  in  order  to  improve  and  extend  services  to  disabled  people. 
The  authority  for  this  program  is  provided  under  Section  4(a)(1)  of  the 
Vocational  Rehabilitation  Act  as  amended  by  the  83rd  Congress.  The 
program is administered by the Division of Research Grants and
Demonstrations under the direction of the Assistant Director for Research and
Training. Under the program, funds are supplied for the partial support of
projects approved by the Director of OVR. An advisory body, the
National Advisory Council on Vocational Rehabilitation, reviews applications
for  grants  and  makes  recommendations  for  action  to  the  Director. 

In  order  that  OVR  might  provide  the  best  technical  evaluation  and 
advice  to  the  Council,  study  sections  have  been  established.  These  study 
sections review applications which are assigned to them and make
recommendations to the Council. The recommendations are a major factor
considered by the Council in its final review of applications. Reviews provided
by  staff  specialists  of  OVR  are  also  made  available  to  the  Council. 

At present, three study sections are in operation, covering psychosocial
research, sensory disabilities, and medical research. Each study section
is  composed  of  individuals  selected  for  their  specialized  knowledge  and 
training  in  research  design  and  in  the  field  of  disability  and  rehabilitation. 
Each section consists of a chairman, a number of officially appointed
members, and an executive secretary who is a professional member of the staff of
OVR.  The  chairman  of  each  section  is  named  by  OVR. 



OVR  Administrative  Procedures 

When  applications  are  received  in  the  Division  of  Research  Grants  and 
Demonstrations,  they  are  assigned  a  number  and  reviewed  by  the  staff  for 
completeness  and  accuracy.  They  are  then  assigned  to  the  appropriate  study 
section.  Assignments  are  made  by  the  executive  secretaries  of  the  study 
sections  and  other  staff  members  of  the  Division  of  Research  Grants  and 
Demonstrations  under  the  direction  of  the  Chief  of  the  Division.  In  some 
cases,  applications  may  be  assigned  to  more  than  one  study  section;  but 
one  study  section  always  has  primary  responsibility  for  recommending 
action  on  the  project.  After  the  study  section  assignments  have  been  made 
the  applications  are  duplicated  and  distributed  to  the  executive  secretaries 
who,  in  turn,  distribute  them  to  the  study  section  members.  Each  member 
receives  a  copy  of  every  application  assigned  to  his  study  section.  In  order 
to  facilitate  the  review  process,  however,  two  members  are  designated  by 
the executive secretary as the principal reviewers on each application. This
is done only to save time; all members participate in
the review of each project. The applications are distributed to the members
several  weeks  in  advance  of  their  meetings  in  Washington.  These  meetings 
are  held  three  times  a  year,  approximately  six  weeks  in  advance  of  the 
meetings  of  the  National  Advisory  Council.  Arrangements  for  the  meetings 
and  for  the  travel  and  reimbursement  of  the  members  are  made  through 
the  executive  secretaries. 
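
The timing rule described above, under which study sections meet about six weeks ahead of the Council, can be illustrated with a short date calculation. This is only a sketch: the Council date used here is hypothetical, and the function name `latest_study_section_date` is an assumption introduced for illustration, not an OVR term.

```python
from datetime import date, timedelta

# Lead time stated in the text: study sections meet roughly six weeks
# before the National Advisory Council.
LEAD_TIME = timedelta(weeks=6)


def latest_study_section_date(council_meeting: date) -> date:
    """Return the latest date that still leaves six weeks before the Council meets."""
    return council_meeting - LEAD_TIME


# Hypothetical Council date, chosen only to show the arithmetic.
council = date(1963, 3, 18)
print(latest_study_section_date(council))
```

The six-week margin is what leaves room for site visits and for distributing materials to the Council.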

Study  Section  Procedures 

The study sections have been assigned responsibility for (1) the technical
review of applications for research and demonstration grants in their
respective fields and (2) surveying, as scientific leaders, the status of
research in their fields in order to determine areas in which research
activities should be initiated or expanded.

Review of Applications. At the study section meetings the chairman
introduces each project. If a study section member is included in the list of
personnel,  or  if  the  project  is  submitted  from  his  institution,  he  at  this  time 
leaves  the  room  for  the  duration  of  the  discussion  of  this  application.  Any 
study  section  member  may  also  be  excused  from  the  discussion  of  any 
project  at  his  own  request.  The  executive  secretary  next  presents  relevant 
background  information  collected  from  the  OVR  staff  and  other  sources, 
including  the  proposed  budget.  The  two  members  assigned  to  the  project 
then  give  their  comments  and  this  is  followed  by  general  discussion  by  the 
study  section  members. 



On  the  basis  of  their  review,  the  study  section  presents  comments  and 
recommendations  on  each  project  for  the  Council’s  consideration.  The 
various  recommendations  which  can  be  made  are: 

1) Approval of the study as presented for the budget and period of
time requested.

2) Approval of the proposal with suggested modifications in study
design or in the budget or period of time. (Increases or decreases
in budget or time may be recommended.)

3) Disapproval if the proposal submitted is not acceptable to study
section members.

4) Disapproval with advice to resubmit (if the proposal submitted is
not acceptable but the study section feels something worthwhile
might be developed).

5) Deferral of the proposal (either for additional information or for
a project site visit).

6) Conditional approval by study section members (based upon
clarification of questions raised by the study section or staff negotiation).

When a site visit is recommended, two study section members and the
executive secretary of the study section generally constitute the site visit
team. The selection of members for the site visit team may be made at the
time of the study section meeting or at a later time by the chairman and
executive secretary. Site visits usually are made to obtain information which
cannot readily be obtained by telephone or correspondence. This includes
(a) discussing the conception and methodology of the proposal and advising
the applicant on improvements which might be made in the project design,
(b) evaluating project personnel whose competence is unknown, (c)
evaluating the adequacy of the facilities for conducting the project, and
(d) obtaining detailed information on procedures of a technical nature,
including budget items. Whenever possible, site visits are made prior to the
Council meeting and the recommendations of the study sections are
converted to approval or disapproval on the basis of the site visit report. When
time permits, the study section members indicate their recommendations by
mail vote.
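
The six recommendation categories, and the way a deferral is converted after a site visit, can be pictured as a small model. This is an illustrative sketch only: the names `Recommendation` and `resolve_deferral`, and the single yes-or-no flag standing in for the site visit report, are assumptions made for the example and do not describe any actual OVR record-keeping.

```python
from enum import Enum


class Recommendation(Enum):
    """The six recommendations a study section can make, as listed in the text."""
    APPROVAL = 1
    APPROVAL_WITH_MODIFICATIONS = 2
    DISAPPROVAL = 3
    DISAPPROVAL_WITH_ADVICE_TO_RESUBMIT = 4
    DEFERRAL = 5
    CONDITIONAL_APPROVAL = 6


def resolve_deferral(current: Recommendation, site_visit_favorable: bool) -> Recommendation:
    """Convert a deferral to approval or disapproval after a site visit.

    Assumption: the site visit report is reduced to one favorable/unfavorable
    flag. Any recommendation other than a deferral is returned unchanged.
    """
    if current is not Recommendation.DEFERRAL:
        return current
    if site_visit_favorable:
        return Recommendation.APPROVAL
    return Recommendation.DISAPPROVAL


print(resolve_deferral(Recommendation.DEFERRAL, site_visit_favorable=True))
```

In practice the conversion rests on the judgment of the study section members, expressed where time permits by mail vote; the flag above merely stands in for that judgment.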

Conditional  approval  is  recommended  when  minor  criticisms  have  been 
raised  which  can  be  answered  by  telephone  or  mail  communication.  The 
executive  secretary  and  the  study  section  members  who  wish  clarification 
of  items  are  appointed  to  seek  the  required  information  from  the  applicant. 

Technical  criteria  used  by  the  study  sections  in  evaluating  a  project  are 
as follows:



Nature of the problem: Is the problem important and relevant to
vocational rehabilitation? Will it contribute significantly to the understanding or
treatment of disability and the vocational rehabilitation of the disabled? Is
it sufficiently delimited to be answerable by the proposed techniques of
investigation and with the resources available to the investigator?

Rationale:  The  problem  should  be  presented  within  a  framework  of 
ideas  relating  it  to  the  general  body  of  theory  or  of  service  considerations 
from  which  it  arose. 

Review of the literature: There should be adequate coverage and
interpretation of relevant preceding studies showing how they relate to the present
problem. 

Population  used:  The  population  should  be  clearly  defined  and  the 
nature  of  the  sample  used  should  be  discussed.  The  sample  should  be 
sufficiently  large  and  representative  to  permit  generalization  to  other 
relevant  situations. 

Methods of investigation: These should be clearly appropriate for
investigating the problem. The procedures should be as objective as possible
and repeatable by other investigators. If a demonstration of behavioral
improvement is to be made, there should be adequate methods of evaluating
behavior before and after training or other treatment. Criteria of
improvement should be objective and specifically stated. If mental tests already
available are to be used, they should be designated as specifically as
possible. If tests must be constructed for use on the project, the feasibility of
this procedure should be indicated. When it is appropriate to use control
groups, the method of constructing them should be given and should be
practical (for example, not requiring matching on an impossible number
of variables). Procedures for analyzing data should be indicated clearly
enough to show that they will answer the questions posed by the research.

Personnel: The project staff should be adequate in numbers and quality
to complete all planned phases of the project in the time allotted.
Professional personnel should have a sufficient amount of training and experience.

Budget:  These  items  should  be  realistic,  neither  too  meager  nor  overly 
ambitious.  Expenses  should  be  clearly  relevant  to  the  project.  For  example, 
travel items should be shown to be for the specific purpose of collecting
data,  obtaining  necessary  information  from  other  research,  or  attending 
professional  meetings  relevant  to  the  project. 



Following  the  review  of  projects,  administrative  matters  relating  to 
the study sections are considered. Site visit members and times are
discussed. The study sections decide on the dates for the next meeting. An
effort is made to meet at least six weeks before the Council meeting so
that sufficient time may be allowed for site visits and distribution of
materials to the Council.

Status  of  Research.  The  second  responsibility  of  the  study  sections  is 
that  of  surveying  the  status  of  research  in  their  fields  in  order  to  determine 
areas  in  which  research  activities  should  be  initiated  or  expanded  and  to 
suggest  ways  in  which  this  might  be  accomplished.  A  part  of  each  meeting 
often  is  devoted  to  this.  The  expansion  of  research  in  a  given  area  can 
be  advanced  through  the  efforts  of  individual  study  section  members  to 
stimulate  and  encourage  young  investigators  and  through  the  sponsorship 
of  workshops  and  conferences  devoted  to  specific  research  needs. 

Council  Procedures 

The  National  Advisory  Council  on  Vocational  Rehabilitation  is  the 
reviewing body charged with the responsibility of making final
recommendations for action on applications to the Director of the Office of
Vocational Rehabilitation. In discharging this responsibility the Council draws
upon  all  the  sources  of  information  available  to  it.  The  major  sources  of 
information  are  (a)  the  recommendations  from  the  various  study  sections, 
(b)  the  comments  from  the  staff  of  the  OVR,  (c)  comments  from  selected 
experts  in  many  areas,  and  (d)  the  individual  and  collective  knowledge 
and  experience  of  the  Council  members.  The  recommendations  of  the 
study  sections,  the  OVR  staff,  and  outside  experts  are  transmitted  to  the 
Council  along  with  a  summary  of  each  project  several  weeks  before  the 
Council  meetings.  The  summary  and  recommendations  are  prepared  by 
the  executive  secretaries  of  the  study  sections. 

The Council reviews applications against a background of
responsibilities broader than those of the study sections. These include (a) the
needs  of  the  Office  of  Vocational  Rehabilitation,  (b)  the  national  needs 
in  various  areas  of  rehabilitation,  and  (c)  matters  of  policy  which  need  not 
concern  the  study  section.  The  actual  operating  procedures  of  the  Council 
are  similar  to  those  of  the  study  section  and  the  final  product  is  a  series  of 
recommendations  to  the  Director  of  the  Office  of  Vocational  Rehabilitation 
on  the  applications  reviewed.  These  recommendations  are  a  major  guide 
for  the  Director  in  making  the  final  determination  on  applications. 



The  Research  Sequence  in  the  OVR 

The  matters  I  have  discussed  thus  far  relate  mostly  to  the  review  process 
in  our  research  and  demonstration  program.  I  have  avoided  up  to  this 
time  any  attempt  to  differentiate  among  the  various  classes  of  projects 
which  we  consider  for  support.  The  major  problem  here  is  in  differentiating 
research  projects  from  demonstration  and  selected  demonstration  projects 
which  are  a  very  important  part  of  our  program.  In  general,  we  might 
say  that  the  two  types  of  demonstration  projects  are  the  major  tool  used 
by OVR in applying research findings in services to people. Thus our
research sequence consists of (a) research projects aimed at the derivation
of  general  principles,  procedures,  tests,  or  other  products  which  can  be 
applied  in  a  variety  of  service  situations;  (b)  demonstration  projects  aimed 
at  the  establishment  and  evaluation  of  improved  services  through  use  of 
research findings or of other available general knowledge; and (c) selected
demonstration  or  establishment  of  pioneer  services  in  various  appropriate 
settings  according  to  prototypes  derived  from  the  general  demonstration 
projects.  This  progression  provides  a  place  for  both  basic  and  applied 
research  and  assures  as  well  as  possible  that  basic  research  contributions 
will  be  of  use  to  the  sponsoring  agency. 

The  demonstration  project  can  be  conceptualized  as  the  application  in 
a  practical  setting  of  information,  derived  from  either  fundamental  research 
or  from  experience  in  life  situations,  for  the  purpose  of  determining  whether 
this knowledge or experience is actually applicable in the practical
setting chosen. Since demonstrations occupy a middle position in the research
sequence outlined, they partake of the characteristics of both general
investigations and specific services. To omit either aspect is to greatly weaken
the  contribution  which  the  demonstration  project  can  make.  With  good  and 
sufficient  criteria  the  demonstration  can  be  held  to  as  high  standards  of 
excellence  as  can  a  basic  research  study.  Several  criteria  are  of  particular 
importance. 

Novelty 

The  demonstration,  of  course,  presents  a  type  of  service,  but  one  which 
is conceived of as being new and in some respects better than similar ones
in current use. Consequently the first criterion is the novelty of the
procedure offered or of its use in a hitherto unexplored setting. The procedure
must  be  described  with  sufficient  detail  and  clarity  so  that  its  special  features 
can  be  well  understood  and  differentiated  from  related  methods.  If  the 
claim to novelty is the setting used, the general implications of the tryout in
such a location should be made clear. Other relevant investigations and
services  completed  or  in  progress  should  be  discussed.  A  rationale  for  the 
investigation  should  be  presented.  That  is,  it  should  be  placed  in  a  general 
ideational  framework  utilizing  past  research  or  experience  which  has  led 
to the development of the present plan. Hypotheses as to the effective
elements in the new methods are here involved.

Evaluation 

There  must  be  a  plan  for  systematic  evaluation  of  the  effectiveness  of 
the  proposed  procedure.  This  involves  careful  conceptualization  of  the 
possible  outcomes  of  the  methods  employed  and  development  of  reliable 
methods of evaluating subjects in relation to these outcomes at the
beginning and end of treatment and at strategic intermediate points. Outcomes
should  be  carefully  analyzed  whenever  possible  to  reveal  a  differentiated 
measure  of  improvement  in  all  relevant  aspects  rather  than  one  over-all 
measure  of  betterment.  The  elements  of  the  profile  should  evolve  out  of 
the  rationale  of  the  project,  deriving  from  hypotheses  as  to  relationships 
between procedural factors and behavioral outcomes. If intergroup or
interpersonal interaction is considered a part of the efficacy of the proposed
procedure, there should be some provision for systematic sampling of this
interaction  at  successive  stages  of  the  demonstration  so  that  the  process  of 
improvement  of  individuals  can  be  followed  and  understood. 

“Generalizability” 

The  project  should  demonstrate  procedures  which  are  practical  and 
meaningful  in  situations  normally  encountered.  Results  should  not  depend 
on  personnel,  equipment,  or  services  which  are  necessarily  unique  to  certain 
locations.  Methods  should  be  analyzable  in  a  specific  objective  manner  so 
that  they  can  be  communicated  to  and  reproduced  by  qualified  individuals. 

Desirability 

The  significance  of  the  work  for  the  community  should  be  clear.  An 
otherwise  well-planned  project  may  nevertheless  propose  a  procedure  less 
advantageous than those already existing or under development. An
investigation’s “novelty” might even at times represent an unintentional return
to  a  type  of  program  previously  discarded. 

If  the  demonstration  conforms  acceptably  to  the  criteria  which  have 
been  stated  its  results  can  easily  be  cast  in  a  form  suitable  for  use  in  a 
large class of practical situations. The study of process already mentioned
can also lead to hypotheses for further basic research. Thus research and
demonstration  have  a  reciprocal  effect  on  each  other. 

The  Office  of  Vocational  Rehabilitation  supports  a  wide  program  of 
demonstration  projects  in  most  major  disabilities.  When  one  of  these 
projects shows unusual success in filling a service need for disabled people,
it may be selected as a prototype for inclusion in the selected
demonstration program. The initial demonstration project then becomes a model
on  which  similar  projects  can  be  based  and  thus  help  spread  the  service 
throughout  the  country.  Thus,  the  research  and  demonstration  program 
of  the  OVR  through  research  projects,  demonstration  projects,  and  selected 
demonstration  projects  provides  support  for  the  total  research  process  from 
initiation  to  application. 

THE  INTERNATIONAL  PROGRAM 
Research 

Since  August  1960  the  Office  of  Vocational  Rehabilitation  has  conducted  a 
program  of  financial  support  for  the  conduct  of  rehabilitation  research 
abroad,  with  American-owned  foreign  currencies  accumulated  from  the 
sale  in  foreign  countries  of  agricultural  surpluses  under  Public  Law  480. 
For  F.Y.  1961,  $930,000  was  made  available  for  this  program;  $1,372,000 
for F.Y. 1962, and $2,000,000 was included in the President’s budget
request for F.Y. 1963. The countries involved in the cooperative program
are  Brazil,  Burma,  India,  Indonesia,  Israel,  Pakistan,  Poland,  UAR-Egypt, 
Syria,  and  Yugoslavia.  To  date  25  projects  have  been  approved  and  an 
additional  25  are  in  the  process  of  development. 
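
The growth of the program can be read directly from the three figures just quoted. The short calculation below is a sketch that uses only those figures; the variable and function names are illustrative assumptions, and the F.Y. 1963 amount is a budget request rather than funds already made available.

```python
# Amounts made available, by fiscal year, as stated in the text (dollars).
MADE_AVAILABLE = {1961: 930_000, 1962: 1_372_000}

# Amount included in the President's budget request (not yet appropriated).
REQUESTED = {1963: 2_000_000}


def percent_increase(earlier: int, later: int) -> float:
    """Percentage growth from one year's amount to the next."""
    return (later - earlier) / earlier * 100


total_available = sum(MADE_AVAILABLE.values())
growth_1962 = percent_increase(MADE_AVAILABLE[1961], MADE_AVAILABLE[1962])
growth_1963 = percent_increase(MADE_AVAILABLE[1962], REQUESTED[1963])

print(f"Made available, F.Y. 1961-62: ${total_available:,}")
print(f"Increase, F.Y. 1961 to 1962: {growth_1962:.1f} percent")
print(f"Requested increase, F.Y. 1962 to 1963: {growth_1963:.1f} percent")
```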

The  purpose  of  the  program  is  to  assist  research  projects  abroad  which 
(a)  will  lead  to  the  development  of  new  knowledge  and  techniques  for 
eliminating  or  reducing  the  handicapping  effects  of  disability  or  (b)  will 
provide new application of existing knowledge and techniques to
rehabilitation problems. Projects should produce results of mutual benefit to
rehabilitation in the United States and in the country in which the project
is  carried  out. 

Projects may deal with specific disabilities or combinations of
disabilities. They may deal with the solution of problems of disability looking
to  maximum  self-sufficiency  or  productivity  of  the  disabled.  Some  of  the 
types  of  projects  which  might  be  carried  out  under  this  program  are: 

1) Investigation, analysis, and evaluation of techniques for rehabilitation
of the severely disabled, involving problems of blindness, of deafness,
of Hansen’s disease (leprosy), of severely disabled persons living in rural
areas, and other similar problems of the severely disabled.

2) Studies of employment problems and opportunities for the
handicapped, related to prevailing industrial and agricultural patterns, and to
the development of both regular competitive employment and special
workshop employment.

3) Research in prosthetic appliances and orthotic devices (artificial
limbs and braces, and other aids) including projects where prosthetics
laboratories and other fabricators are developing the use of local materials,
of new and different materials, of new design, or of new methods of training
the prosthetics wearer in their use.

4) Pilot studies or experimental attempts to provide improved
rehabilitation services, for the purpose of testing or establishing standards or
methods of service that are practicable and effective for general application
in the rehabilitation effort; and pilot projects that provide a special type
of rehabilitation service in order to test its value in rehabilitation and to
provide information on costs, methods of administration, methods of
providing services, or rehabilitation techniques.

Research projects may also provide for the bringing together of
scientists and investigators to plan programs for research designed to raise the
level of rehabilitation knowledge and techniques and for increasing
competence and experience in rehabilitation research.

Interchange  of  Rehabilitation  Personnel 

In September 1961 the Office of Vocational Rehabilitation began to
implement the additional international research authorities provided by Public Law
86-610, the International Health Research Act. Under Section 4 the OVR
is authorized, among other things, to arrange for the interchange between
the United States and participating foreign countries of scientists and
experts engaged in rehabilitation research. Pursuant to this authority, a group
of  12  specialists  in  plastic  surgery  has  been  organized  to  go  to  the  Christian 
Medical  College  and  Hospital  in  Vellore,  India,  in  rotation,  to  advise  and 
help  build  a  broad  program  for  the  rehabilitation  of  lepers.  The  work  of 
these specialists ties in directly with a rehabilitation research project
previously approved under Public Law 480 by the Office of Vocational
Rehabilitation. Four of these surgeons have already gone to India on this
program. 

Two United States experts in prosthetics and orthotics were sent to
Yugoslavia in 1962 to work for two months with outstanding orthopedic
physicians and physiatrists there on the problems of design, manufacture
of components, and the principles of fitting and alignment. Through
their  efforts  newer  materials  of  lighter  weight  and  greater  strength  were 
developed. 

Development  of  Project  Proposals 

The  Office  of  Vocational  Rehabilitation  will  be  glad  to  correspond  with 
representatives  of  organizations  who  wish  to  explore  informally  whether  a 
project idea comes within the terms of this program. Projects finally
submitted as outlined in the Guide will be reviewed by the OVR. If a project is
eligible for support, funds will be made available under the terms of individual
project  agreements  entered  into  by  the  Office  of  Vocational  Rehabilitation, 
Department  of  Health,  Education,  and  Welfare  and  the  institution,  agency, 
or  organization  within  the  cooperating  country. 


STIMULATION  OF  RESEARCH  IN 
TECHNOLOGY  AND  BLINDNESS 


MILTON  D.  GRAHAM 

American  Foundation  for  the  Blind,  New  York,  New  York 


I  have  some  fairly  astringent  remarks  to  make,  in  the  hope  that  this  will 
prompt  some  discussion.  I  think  we  need  some  straightforward  language 
at  this  Congress;  my  hope  is  that  we  will  be  realistic  about  what  comes 
after.  If  we  are  realistic  we  can  realize  some  of  the  limited  goals  of  some 
of  us  concerned  with  follow-up. 

The  stimulation  of  research  is  my  focus.  I  may  go  afield  a  bit,  but  I 
hope  you  will  understand  that  my  only  concern  is  to  see  that  more  and 
better  research  is  done.  There  are  obstacles  to  overcome;  I  would  like 
to  name  some  of  those  obstacles;  I  would  like  to  name  some  of  the  hopes 
we  have  as  well. 

It  has  been  pretty  much  the  sense  of  these  meetings  that  we  need  more 
fundamental,  multidisciplinary  research  of  high  caliber.  I  say  high  caliber 
for  this  reason:  in  reviewing  our  activities  at  the  Foundation  not  long  ago, 
I  estimated  that  Mr.  Dupress  and  I  have  probably  been  asked  to  evaluate 
or to comment formally on some 50 research proposals to various
organizations, government and otherwise. I estimated that of those 50 we were
negative  in  our  evaluation  of  about  40  of  them.  This  didn’t  necessarily  mean 
that  they  weren’t  funded,  I  might  add.  Looking  back  upon  these  proposals, 
we  felt  what  we  had  done  was  to  set  up  some  standards  of  research  excel¬ 
lence  and  of  research  methodology  that  the  field  needed.  We  were  also 
leaving  ourselves  open  to  the  charge  that  we  were  unsympathetic  to  the 
object  of  some  of  the  research  proposed — but  we  took  that  risk:  there 
should  be  no  compromise  on  design. 

Within  my  experience,  the  research  that  has  been  done,  that  is  being 
done,  or  that  is  intended  to  be  done  on  sensory  deprivation,  has  had
the  very  best  of  intentions.  This  doesn’t  necessarily  mean,  however,  that 
the  methodology  is  sound.  I  think  we  have  a  right  to  demand  that  the  re¬ 
search  methodology  be  sound  regardless  of  the  subject.  I  think  in  this  field, 
particularly,  we  should  insist  it  be  very  sound.  We  have  enough  to  do  in
providing  reliable  data  to  a  field  that  abounds  in  misconceptions;  we  cannot
afford  to  do  it  sloppily.  For  this  reason  I  think  high  standards  are  im¬ 
perative.  It  would  be  better  not  to  do  any  research  at  all  than  to  have 
poor  research.  The  field  will  not  be  well  served  by  it. 

We  have  yet  another  problem.  We  have  enlarged  the  scope  of  our 
interests.  Indeed  this  meeting  is  an  expression  of  that  enlargement  of  scope. 
We  can  see  that  other  people,  outside  the  traditional  disciplines  that  have 
contributed  to  the  welfare  of  the  blind,  can  contribute  now  perhaps 
markedly.  We  have  made  an  attempt  to  solicit  help  from  other  disciplines 
that  have  not  been  traditionally  interested  in  the  welfare  of  the  blind.  I 
think  we  have  found  it  a  most  satisfying  search:  we  believe  these  disciplines 
have  an  important  contribution  to  make.  I  think  it  is  up  to  us  in  this  field 
to  continue  keeping  up  with  the  work  of  those  who  will  not  continue  to  be 
in  work  directly  associated  with  the  blind,  and  who  can’t  be  expected  to 
get  into  that  work,  but  whose  work  should  be  watched  very  carefully  by 
us  for  whatever  knowledge  and  information  we  can  get  from  it.  I  put  the 
responsibility  on  our  shoulders,  in  effect,  to  keep  aware  of  what  is  going 
on  in  fields  that  are  traditionally  outside  our  interests.  I  would  like  to  think 
in  time  that  we  can  enlarge  our  scope  so  these  other  fields  of  research 
would  not  be  outside  the  traditional  field:  I  would  like  to  broaden  the 
tradition.  But  it  is  not  realistic  at  the  moment  to  expect  other  very  busy, 
very  energetic,  and  interested  people — interested  in  us — to  take  much  time 
with  this  expansion  of  our  scope.  For  a  while — at  least  until  we  see  the 
need  for  enlarging  our  traditional  scope — we  must  take  the  initiative  our¬ 
selves. 

Another  point:  in  this  enlargement  of  our  traditional  scope  we  neces¬ 
sarily  involve  multidisciplinary  interests  and  multidisciplinary  research. 
This  is  a  fine  sounding  and  much  overused  word,  multidisciplinary  (in¬ 
terdisciplinary),  but  I  don’t  know  any  better  word.  I  would  like  to  take 
this  word  and  discuss  what  it  means  so  far  as  practical  research  adminis¬ 
tration  goes.  I  spend  a  great  deal  of  my  time  in  the  administrative  task 
of  shuffling  papers,  and  I  spend  a  great  deal  of  my  time  getting  other  people 
to  take  on  the  research  tasks  that  we  think  are  important  to  our  field — 
that  is,  stimulating  research.  I  know  how  difficult  it  is  to  interest  people 
in  our  work  if  we  approach  them  with  the  attitude  that  they  might  “help 
the  poor  blind.”  Most  research  people  have  no  use  for  such  an  approach. 
We  get  much  better  results  when  we  can  approach  them  in  the  sense  of 
“Here  is  a  complex,  difficult  problem  for  us;  it  seems  as  if  you,  as  a  scientist,
with  your  accomplishments,  might  be  interested  in  some  of  the
complexities  of  our  problems;  do  you  have  any  interest;  can  you  offer  us
any  help?”  On  the  problem  approach  basis,  we  have  had  a  great  deal  of 
very  valuable  help.  This  approach  implies,  however,  when  three  or  four 
disciplines  cooperate,  that  if  there  is  not  an  integrated  team  already  in  ex¬ 
istence,  we  find  ourselves  in  the  role  of  playing  emissary  between  the 
factions,  the  disciplines,  the  offices,  or  whatever  of  the  various  members  of 
the  group.  This  is  a  very  difficult  role.  I’m  saying  here  what  I  think  everyone
in  research  must  know:  that  the  word  “interdisciplinary”  is  very  easy  to
say,  and  very  difficult  to  accomplish.  It  is  a  part  of  our  job  to  try  to  help 
smooth  out  differences,  to  help  establish  communication  lines  across  dis¬ 
ciplines,  to  bridge  the  gap  across  various  groups.  It  is  not  an  easy  job,  and 
at  the  Division  of  Research  of  the  American  Foundation  for  the  Blind  we 
are  not  sure  how  well  we  do,  but  we  are  continuing  to  try.  We  hope  that  in 
time  we  can  put  across  an  approach  we  call  the  “program  approach,”  that 
is,  a  coordinated  long  range  statement  of  our  aims,  goals,  needs,  and  ac¬ 
complishments  that  will  make  sense  to  any  researcher  consulting  it. 

For  too  long  research  in  our  field  has  gone  on  a  project-to-project  basis: 
one  project  group  doesn’t  know  what  another  is  doing,  and  cares  less.  They 
may  not  relate  to  each  other;  they  may  or  may  not  add  up  together  to  some 
of  the  possible  solutions  to  some  of  the  important  problems  that  blind  peo¬ 
ple  have;  they  may  even  be  concerned  with  unimportant  problems.  I 
maintain  that  this  is  wasteful  and  inefficient;  that  we  should  know  what  these 
main  problems  are;  that  we  should  attack  them  systematically;  and  we 
should  set  up  lines  of  communication  between  any  one  particular  project  and 
other  related  ones — or  among  all  projects,  if  we  are  ultimately  to  make  our 
research  findings  meaningful  to  persons  concerned  with  visual  and  other 
sensory  impairments.  I  am  sorry  to  say  that  in  my  opinion  the  grant
mechanism  that  is  set  up  in  the  Federal  government  in  our  particular  field
doesn’t  encourage  this  point  of  view.  I  think  in  time  it  will.

In  the  “hard”  or  “exact”  sciences  there  has  had  to  be  a  program  approach,
from  the  standpoint  of  national  survival.  Hence,  centers  have  been
subsidized,  and  the  large  problems  in  nuclear  physics,  aerospace  science,  and 
other  complex  areas  have  been  undertaken.  I  think  this  may  be  a  model  of 
approach  for  us.  Entirely  too  much  time  is  spent  by  researchers  running 
among  groups  of  people  finding  out  what  they  are  doing.  If  they  could  sit 
and  work  in  one  office  or  one  building  where  almost  everyone  concerned 
happened  to  be,  I  think  a  great  deal  more  work  would  be  done.  It  certainly 
could  be  no  less  productive  than  the  present  system. 

I  hope  the  program  approach  will  eventually  become  established  in  our
field.  This  would  help  those  of  us  who  have  to  deal  with  many  varying  disciplines,
peoples,  groups,  and  kinds  of  funding.  To  be  realistic  I  have  to
acknowledge  one  criticism  raised  about  it  that  is  not  at  all  true,  even  if  it  is 
persistent.  This  is  the  claim  that  a  program  approach  means  some  one  person 
or  group  must  formulate  it  and,  therefore,  by  definition,  it  is  dictatorial  and 
inflexible:  one  group  will  dominate  all  research  in  the  field.  I  think  this  is 
ridiculous,  and  I  don’t  believe  anyone  who  raises  this  complaint  really 
knows  the  temperament  of  research  people.  Try  to  dominate  most  of  them! 
They  have  to  maintain  their  professional  integrity  at  all  times  before  the  wary 
eyes  of  their  fellow  researchers.  So  I  conceive  a  program  as  formulated  by 
a  voluntary  association  of  like-minded  specialists,  who  know  when  to  agree 
and  when  to  disagree.  Such  people  are  brought  together  to  discuss,  in  the 
light  of  their  own  disciplines  and  training,  some  recommendations,  thoughts, 
and  ideas  toward  the  solution  of  common  problems. 

The  alternative  to  the  program  approach  is  what  we  are  doing  now  in 
the  research  and  development  on  sensory  impairment.  I’d  like  to  give  you 
a  couple  of  painful  examples  of  the  funding  of  research  in  the  past  that  have 
lessons  for  the  future. 

I  had  the  opportunity  to  see  happen  a  process  that  might  be  called  con¬ 
fusing  definitions  or  “falling  between  stools.”  Blindness  imposes  several 
kinds  of  problems — in  addition  to  the  fact  that  blind  people  happen  to  be 
people,  and  people  normally  have  a  host  of  complex  problems.  Any  single 
approach  or  single  project,  it  seems  to  me,  that  thinks  it  has  one  answer 
for  the  lot  serves  no  one  well.  Consider  the  case  of  some  children,  severely 
emotionally  disturbed  and  blind,  and  usually  multiply  handicapped.  (Those 
of  you  who  have  seen  the  film  or  play  of  The  Miracle  Worker  might  be 
interested  to  know  that  one  of  the  little  girls  in  this  group  with  which  I  was 
concerned  was  the  model  for  the  stage  mannerisms  of  Patty  Duke.)  When 
it  came  time  to  obtain  federal  support  for  the  research  into  methods  of 
therapy  that  might  benefit  these  children  and  to  work  with  their  families  one 
funding  group  said,  “We’re  sorry,  that’s  a  mental  or  emotional  problem  and 
is  not  our  concern.”  The  other  group  to  whom  it  was  then  referred  said, 
“Sorry,  that’s  a  ‘blind’  problem  and  not  our  concern.”  It  took  us  a  year  and 
a  half  before  we  finally  got  the  two  parties  together  to  agree  under  whose 
purview  the  problem  came,  and  to  get  the  project  funded. 

There  has  been  interest  in  the  past,  I  know,  among  some  government 
offices  in  setting  up  some  kind  of  systematic  mutual  exchange  of  funding 
and  ongoing  research  information.  This  example  I’ve  mentioned  happened 
to  be  within  parts  of  the  same  office,  but  it  can  happen  between  offices  as
well.  The  amount  of  energy  that  is  expended  on  coordination  outside  of  the
funding  agencies  in  order  to  get  funds  is,  I  think,  excessive;  the  system  should 
be  changed  to  allow  better  coordination  within  government  offices,  and  be¬ 
tween  government  offices.  We  had  once  made  a  recommendation  to  a 
Congressional  committee  that  perhaps  in  our  field  we  needed  an  interagency
organization  that  had  access  to  several  agencies  for  the  funding  of  research 
to  meet  our  problems;  one  agency  is  interested  often  in  only  one  topic  and 
takes  one  point  of  view,  while  another  agency  is  interested  only  in  another 
aspect  of  the  same  research  project,  and  so  on.  The  children  involved  in  the 
proposal  mentioned  received  some  therapy,  and  some  research  was  done — 
but  about  two  years  later  than  necessary,  by  a  staff  that  had  spent  too  much 
of  its  professional  time  merely  raising  funds. 

Some  readers  probably  think  I  am  being  extreme  when  I  say  that  the 
grant  mechanism  as  it  stands  now  is  antiquated  and  must  be  changed  if  we 
are  going  to  get  some  of  these  difficult  problems  solved.  We  feel  it  in  our 
field  particularly  because  we  are  a  marginal  field:  numerically,  we  are
small.  Our  problems  are,  however,  among  the  most  complex  in  existence.  I 
think  that  in  order  to  stimulate  future  research,  some  look,  some  critical  look,
should  be  taken  at  the  granting  mechanism.  The  natural  sciences,  and  now
the  medical  sciences,  have  led  the  way  and  the  office  of  the  general  chairman 
of  this  Congress,  Dr.  Jerome  Wiesner,  has  stimulated  much  useful  discussion 
on  this  topic  of  the  program  approach. 

Another  example  of  funding  problems  had  a  happier  ending.  This  Congress
was  sponsored  by  five  organizations.  We  had  no  special  difficulty  in
obtaining  money  for  this  meeting,  particularly  when  we  laid  out  our  plan  and 
explained  it.  But,  because  it  was  an  expensive  venture  (there  was  a  great  deal 
of  expense  in  bringing  our  overseas  guests  here),  we  had  to  go  to  several 
organizations;  we  pieced  it  together,  in  other  words.  We  were  successful;  I 
don’t  regret  a  minute  of  the  time  that  was  spent. 

Other  piecings  together — some  funds  from  here  and  some  from  there — 
have  not  had  such  a  happy  outcome.  I  don’t  think  this  should  be  necessary. 
Perhaps  as  we  define  our  problems  better  and  educate  funding  organizations 
to  these  more  sophisticated  definitions  we  shall  spend  less  energy  on  the 
thankless  task  of  raising  several  small  sums  of  money  for  one  worthwhile 
endeavor. 

This  hope  is  very  pertinent  to  my  last  remark,  which  takes  us  into  the 
area  that  we  are  going  to  be  concerned  with  later  on — the  establishment  of 
an  information  center,  or  whatever  one  wants  to  call  it,  and  establishing 
lines  of  communication.  We  wanted  to  see  what  chances  were  of  establishing
some  means  whereby  we  could  keep  up  communications  after  this  Congress
ended.  I  was  interested  in  some  of  the  reactions  that  I  might  get  from  possible 
funding  agencies.  I  went  to  something  like  five  or  six  different  agencies — 
none  of  them,  I  might  add,  were  represented  at  the  Congress.  One  said, 
“Well,  this  part  of  it  looks  like  our  interests”;  another  said,  “We  are  not 
interested  in  this  part,  but  that  part  might  be  it.”  It’s  the  old  piecing-together 
problem  again,  and  it  devolves  upon  the  proposer  to  run  the  shoals  of 
narrow  categorical  interests. 

Now  perhaps  we  will  be  successful  in  setting  up  an  information  center, 
if  we  decide  to  do  it,  and  if  we  get  some  sentiment  to  the  effect  that  it  is 
worth  the  effort.  It  unfortunately  does  not  seem  to  be  enough  that  the  field 
has  a  clear  call  for  something  to  be  done.  That  “clear  call”  must  be  trans¬ 
posed  into  statements  that  will  be  reviewed  positively  by  persons  with  little 
or  no  knowledge  of  blindness  and  visual  impairment.  The  result  is  almost 
predictable:  failure  to  understand.  Let  me  illustrate  this  point.  Having 
worked  several  months  with  a  very  capable  civil  servant  for  whom  I  have 
great  respect  to  get  a  research  project  going — and  after  months  of  work¬ 
ing  with  his  staff  trying  to  phrase  the  proposal  for  lay  persons — the 
civil  servant  said  to  me,  “You  really  want  a  government  pension  for  the 
blind,  don’t  you?”  My  reply  was  straightforward:  In  that  remark,  there’s 
several  months’  work  lost.

There  are  other  hazards  in  trying  to  set  up  a  research  and  development 
program  on  visual  impairment.  There  are  people  who  have  said,  “Why 
do  you  need  to  go  to  the  government?  You  get  all  the  money  you  want.  Just 
parade  your  blind  people  out  with  their  dogs  and  their  canes  and  you  can 
get  all  the  money  you  want.”  This  is  what  some  people  feel.  If  they  are 
honest  enough  they’ll  tell  you  so;  others  may  think  it  without  saying  it;  still 
others  may  not  be  aware  that  they  think  such  things.  These  are  the  “hidden 
biases”  we  find  so  widespread.  I  think  that  we  have  a  good  deal  of  work  to  do 
in  trying  to  make  quite  clear  that  we  are  interested  in  research  that  concerns 
people.  We’re  not  interested  just  in  a  pair  of  eyes,  or  in  a  pair  of  ears,  or 
some  fingertips,  or,  as  some  people  think,  a  cripple  who  can  only  be  taken 
care  of,  not  taught  to  be  independent—we’re  interested  in  human  beings,
and  we’re  interested  in  good  research  that  will  help  human  beings.  I  think 
this  is  the  basic  point  of  view  we  should  make  explicit  to  funding  organiza¬ 
tions  and  researchers  alike.  I  believe  we  can  stimulate  research  in  this 
country  which  is  in  our  interest.  I  think  we  can  keep  up  the  essential  links 
of  communication. 

I  would  like  to  say,  in  closing,  that  we  at  AFB  intend  to  do  our  best  in
keeping  up  with  at  least  American  research  and  development  in  our  field.
We  will  try  to  do  our  best  here  for  the  blind  and  visually  impaired;  we  will 
make  every  effort  to  raise  funds  to  do  this  job;  and  you  can  rest  assured 
that  if  anyone  abroad  wants  to  associate  himself  with  us,  or  ask  our  help, 
or  give  us  help,  we  will  be  very  willing  to  accept  it.  We  intend  to  do  the  job 
as  best  we  can.  We  do  need  help,  of  the  kind  that  I  have,  I  think,  outlined 
here,  including  new  ideas,  imaginative  thinking,  and  better  definitions  of  our 
problems.  We  need  to  keep  up  with  and  improve  our  systems  of  communi¬ 
cation.  I  think  you  can  help  us;  I  hope  you  can  give  us  some  ideas  on  how 
we  might  do  better.  They  will  be  welcome. 


INDEX - VOLUME  I 


Active  Energy  Radiating  Systems  (see  also  Obstacle  Detectors):  35  ff,  49  ff,
    137  ff,  157  ff,  167  ff,  111  ff
Braille:
    Automatic  Reproduction,  403  ff
    Displays,  Mechanical,  416  ff
    Enhancing  Availability,  410  ff
    Machine  Transcription,  393  ff
    Reading  Technique,  428  ff
    Transcription  and  Reproduction,  410  ff
Brailler,  High  Speed  Electric:  420  ff
Cane:  10,  13  ff
    Electronic,  177  ff
Character  Recognition  (see  also  Reading  Machines):  227  ff,  325  ff
    Contour  Tracing  Techniques,  240  ff
    Electrical  Scanning  Techniques,  233  ff
    Mechanical  Scanning  Techniques,  231  ff
    Parallel  Sensing  Techniques,  237  ff
Committee  on  Sensory  Devices:  170
Compass,  Magnetic:  193  ff
Demographic  and  Social  Aspects  of  Blindness:  481  ff
Direction  Finder:  199  ff
Dog  Guides:  10
Dolphin:  46  ff
Echolocation:  35  ff,  51  ff
Electroencephalography:  364  ff
Guidance  Devices — See  Obstacle  Detectors,  Travel  Aids
Mobility  Requirements:  8  ff
Moth/Bat  Interaction:  58  ff,  113  ff
Obstacle  Detectors:
    Ambient  Light,  167,  187  ff
    British,  137  ff
    Design  Parameters,  188  ff
    Edge  Detector,  183  ff
    Elektroftalm  (Polish),  157  ff
    History  of  (U.S.)  Research,  7  ff
    Infrared,  169,  191
    Optar  (OPTical  Automatic  Ranging),  169
    Optical  Triangulation,  169
    Parallax  Detection,  183  ff
    Photodetector,  171
    Radar,  168  ff
    Signal  Corps  Obstacle  Detector,  169
    Swedish,  187  ff
    Tactile  Stimulator  Output,  170
    Ultrasonic,  168
Office  of  Vocational  Rehabilitation:  497  ff
Oil  Bird  (Steatornis):  46
Orientation  Device:  199  ff
Pattern  Recognition:  279  ff
Porpoise:  46  ff
Reading  Habits  and  Needs  of  the  Blind:  412  ff
Reading  Machines  (see  also  Character  Recognition):  208  ff,  381  ff
    Auditory  Output,  289  ff
    Battelle,  343  ff
    Battelle  Training  Procedures,  355  ff
    Braille  Input,  325  ff
    Burroughs,  236
    Design  Criteria,  205  ff
    European,  261  ff
    Farrington  (Intelligent  Machines  Research  Co.),  229  ff
    Flying  Spot  Scanners,  233  ff
    Functional  Concepts,  215  ff
    Harmon  Moving  Photocell  Design,  231  ff
    Logic  Designs,  245  ff
    Optophone,  291  ff,  325  ff,  343  ff
    Output  Codes,  218  ff,  291  ff,  431  ff
    Perceptron,  238
    RCA  Type  A-2  Reader,  345  ff
    Speech  Output,  294  ff
    Speech  Output  Problems,  455  ff
    Spelled  Speech  Output,  305  ff,  325  ff
    Tonal  Morse  Output,  325
    Ste-Re  System,  427  ff
    Visagraph,  345  ff
    Zworykin,  237
Relief  Drawing  Techniques:  371  ff
Sensory  Compensation:  363  ff
Soviet  Union:  361
Spatial  Orientation:  364




Speech:
    Synthesis,  297  ff,  455  ff
    Time  Compression,  333  ff
    Transmission,  292  ff
Step-Down  Detection  (see  also  Obstacle  Detectors):  177
Tactual/Kinesthetic:
    Communications  System,  312  ff
    Perception,  309  ff,  371  ff
Travel  Aid:  167  ff,  199  ff
Ultrasonic  Radiation  by  Bats:  36  ff,  50  ff,  137  ff
Use  of  Remaining  Sensory  Channels:  362  ff
Vari-Vox  (Doppler  Compressing  Machine):  333  ff




